DVV: A Taxonomy for Mixed Reality Visualization in Image Guided Surgery

Marta Kersten-Oertel, Pierre Jannin, and D. Louis Collins

Abstract—Mixed reality visualizations are increasingly studied for use in image guided surgery (IGS) systems, yet few mixed reality systems have been introduced for daily use into the operating room (OR). This may be the result of several factors: the systems are developed from a technical perspective, are rarely evaluated in the field, and/or lack consideration of the end user and the constraints of the OR. We introduce the Data, Visualization processing, View (DVV) taxonomy which defines each of the major components required to implement a mixed reality IGS system. We propose that these components be considered and used as validation criteria for introducing a mixed reality IGS system into the OR. A taxonomy of IGS visualization systems is a step toward developing a common language that will help developers and end users discuss and understand the constituents of a mixed reality visualization system, facilitating a greater presence of future systems in the OR. We evaluate the DVV taxonomy based on its goodness of fit and completeness. We demonstrate the utility of the DVV taxonomy by classifying 17 state-of-the-art research papers in the domain of mixed reality visualization IGS systems. Our classification shows that few IGS visualization systems' components have been validated and even fewer are evaluated.

Index Terms—Taxonomy, mixed reality, augmented reality, augmented virtuality, visualization, image guided surgery.
1 INTRODUCTION
Mixed reality visualizations have become a focus for research in the medical domain for surgical training, planning, diagnosis, and guidance. Their purpose in image guided surgery (IGS) is to improve the understanding of complex multimodal data. This is typically achieved by registering preoperative data sets of multiple modalities, such as functional data, vasculature, organs, bones, and atlases, to each other as well as to intraoperative data sets and to the patient in the operating room (OR). With such systems, preoperative plans, patient models, and graphical representations of the surgical tools and instruments that are localized in real time in the OR are displayed to guide the surgeon in their task. The tools and data sets are fused into a mixed reality, providing the surgeon a more extensive view beyond the visible anatomical surface of the patient, thereby reducing patient trauma and potentially improving clinical outcomes. For an overview of IGS applications we refer readers to the review by Cleary and Peters [1]. While many new mixed reality systems and visualization techniques have been proposed, few systems are consistently used for IGS, and even fewer are developed for commercial use. Although it is becoming more common for commercial IGS systems to make augmented reality
M. Kersten-Oertel and D.L. Collins are with the McConnell Brain Imaging Center at the Montreal Neurological Institute (MNI), 3801 University St., Montréal, QC H3A 2B4, Canada. E-mail: [email protected], [email protected].
P. Jannin is with the Projet-Unité Visages-U746 INRIA, INSERM, CNRS, Université de Rennes 1, Faculté de Médecine, 2 Avenue du Pr. Léon Bernard, CS 34317, Rennes Cedex 35043, France. E-mail: [email protected].
Manuscript received 3 Aug. 2010; revised 8 Dec. 2010; accepted 20 Jan. 2011; published online 2 Mar. 2011. Recommended for acceptance by D. Schmalstieg. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TVCG-2010-08-0171. Digital Object Identifier no. 10.1109/TVCG.2011.50.
visualizations available (i.e., BrainLab,1 Medtronic,2 da Vinci3), it is not common to see them used in routine clinical practice. One exception is at the University Hospital of Cranio-Maxillofacial and Oral Surgery, where augmented reality technology has been used in routine clinical applications since 1995 [2]. There are several plausible reasons that more systems are not widely used: the systems are often developed as proof-of-concept prototypes and therefore may not take into account the direct clinical needs of the surgeons, they do not take into account the day-to-day operational constraints imposed by the OR and the surgeon, the added benefit is only for less experienced surgeons or surgical residents, and, more often than not, the systems are not sufficiently evaluated and therefore are not convincing in proving their value for given surgical tasks. For these reasons, it is still rare to see mixed reality systems and technologies integrated into real clinical environments and workflows [3]. The objective of this paper is to define a taxonomy4 of mixed reality visualization techniques used in IGS based on the type of data and the type of processing the data undergo before being presented to the end user by means of the view. The purpose of defining such a taxonomy is twofold: 1) to introduce a syntax and framework with which mixed reality IGS systems may be discussed, analyzed, and evaluated and 2) to allow a better understanding of the most relevant and important components of a mixed reality visualization system. In developing a common language, past and current systems can more easily be compared and analyzed. In our
1. http://www.brainlab.com/.
2. http://www.medtronicdiabetes.ca/.
3. http://www.intuitivesurgical.com/.
4. The term taxonomy is used rather than classification or ontology so as to be consistent with the literature. Further, although we provide relationships between the concepts in our taxonomy, we do not provide a complete ontology.
workshop paper [4] we briefly outlined the Data, Visualization processing, View (DVV) taxonomy. Here, we clearly define, discuss, and give examples of each of the major components of a mixed reality IGS system that should be considered and in turn validated for acceptance of the system in the OR. We propose that the components defined by the DVV taxonomy should be discussed and compared with those that have been developed in previous works, and the effectiveness of their use should be quantified. Proper analysis and validation of the major entities of current systems will allow future developers to more easily recognize which components can be reused and which need improvement or new solutions. Such an analysis and evaluation should facilitate a greater presence of future systems in the OR. We demonstrate the usefulness of the DVV taxonomy by classifying 17 state-of-the-art publications which describe mixed reality IGS visualization systems. The papers were chosen to represent the most active groups and authors from a database of 109 papers from the field of IGS visualization systems. We begin to verify the DVV taxonomy based on its goodness of fit (i.e., how well it describes mixed reality visualization IGS systems within the literature) and its completeness, and we compare the components of our taxonomy to those of the image-guided surgery toolkit (IGSTK). The remainder of the paper is organized as follows: We begin by defining mixed reality visualization. This is followed by a discussion of previous works on visualization taxonomies in Section 2. In Section 3, we propose our taxonomy and define the major components. This is followed by an example of how the taxonomy may be used; the taxonomy is instantiated by classifying 17 IGS publications in Section 4. We begin a verification of the DVV taxonomy in Section 5. In Section 6, we explore the components that were not well described in the chosen publications and point out avenues of future research efforts in the field of mixed-reality visualization systems for image-guided surgery. Conclusions are given in Section 7.
Mixed reality visualization definition. As our taxonomy focuses on mixed reality in IGS, we begin by defining mixed reality. Mixed reality has been considered to be the area on the reality-virtuality continuum [5] between reality, the unmodeled real environment, and virtual reality (VR), a purely virtual and modeled environment. The point on the continuum at which an environment lies corresponds to the extent to which the environment is modeled and whether real or virtual objects are introduced into this environment. Upon this continuum lie augmented reality (AR) and augmented virtuality (AV). In AR, the perceptual environment is real; it is a physical location in four dimensions (three spatial and time) and virtual objects are added to this real environment. In IGS, the OR, the medical personnel, as well as the objects within the OR, such as the patient, the surgical table and tools, and computer monitors, are part of the real environment. We define virtual objects as digital representations or models of real objects. An example of an AR system used for neurosurgery comes from Das et al. [6], who projected an MR image onto the skull of a patient giving the surgeon the
Fig. 1. An example of AR and AV in image-guided neurosurgery [9]. Left: In the AR case, virtual contours of preoperative data are superimposed onto the real anatomical surface of the patient using the microscope oculars. Right: In the AV case, the image of the real visible brain surface is registered to the virtual preoperative models of the patient on the digital monitor.
ability to see through the skull to the brain tissue. In this case, the MR image is the virtual object that is projected onto the real patient. For more information about AR, readers are referred to the review by Azuma [7] and a surgeon's perspective on AR by Shuhaiber [8]. In AV, real objects, which are objects with a physical entity that conform to the laws of physics and time, are introduced into a virtual environment. A virtual environment is a simulated digital environment. An example of an AV system used for neurosurgery comes from Paul et al. [9], who took live images from the microscope oculars and registered and displayed them onto the preoperative images of a patient model. In this case, the live images are real objects that are introduced into the synthetic scene of the preoperative images. Fig. 1 shows an example of AR and AV in image-guided neurosurgery [9], for the same patient at the same moment in time of the surgical procedure. In the AR case, virtual contours of preoperative data are superimposed onto the real anatomical surface of the patient using the microscope oculars. In the AV case, the image of the real visible brain surface is registered to the virtual preoperative models of the patient on the digital monitor.
2 PREVIOUS WORK
Previous visualization taxonomies and models have typically focused on only one or two aspects of the visualization: e.g., data type, display mode, interaction style, task, or design model. In the following section, we discuss a number of proposed visualization taxonomies and the particular aspects of the visualization process they emphasized. A number of papers have described one factor only, that of analytical tasks or interactions, as the basis of classifying visualization techniques. Chuah and Roth’s taxonomy [10] used the semantic primitive of basic visualization interaction to characterize tasks using a hierarchy to define what the basic primitives of more complex interactions are, how they are related and how they can be combined. Zhou and Feiner [11] extended this idea by categorizing the high-level presentation or visual implication of specific tasks, as well as the detailed subtasks or low-level techniques needed to achieve them. For example, the high-level presentation intent elaborate can be achieved using the low-level tasks of
emphasize and reveal. Buja et al. [12] also presented a taxonomy for the visualization of high dimensional data based on interaction. In particular, three data analytical tasks (finding Gestalt, posing queries, and making comparisons) and three classes of interactive view manipulations which support these tasks (focusing, linking, and arranging views) formed the basis of their taxonomy. While interaction is important, such a classification ignores the importance of data type, the visual representation of the data, and the display mode. Tory and Möller's taxonomy [13] also used one factor, but focused on models rather than data. They considered two criteria: 1) whether the object of study is discrete or continuous, and 2) which display attributes (e.g., spatialization, timing, color, etc.) are chosen by the designer. Their taxonomy considered the role of users and their conceptual models; however, they did not consider the display mode or interaction models. Others have used two factors, data and interaction technique, to classify information visualizations. Wehrend and Lewis [14] proposed a classification based on object classes and operation classes. Object classes define the data type, whereas operation classes define the user's goal or task. Thus, solutions for complex visualization problems can be broken down so that each subproblem is an entry in the Object Class/Operation Class matrix. Similarly, Shneiderman [15] proposed a data type by task matrix, specifying seven data types: 1D, 2D, 3D, temporal, multidimensional, tree, and network data; and seven tasks: overview, zoom, filter, details-on-demand, relate, history, and extract, to classify information visualizations. Although such matrix solutions may provide references for available techniques, they are often not detailed enough and do not provide information about the appropriateness of a given technique for a given data type. Based on the work of Bertin [16], who analyzed the relationship between data characteristics, graphic variables, and human perception, Card and Mackinlay presented a semiology [17] of information visualization techniques based on the similarities of the types of data presented, as well as on the automatic and controlled processing of the data's graphical properties. Their work provided insight into the focus of different visualization techniques as well as a method with which to organize, compare, and understand the differences between particular visualization methods. Keim [18] proposed a taxonomy based on three factors: data type, display mode, and interaction and distortion. This orthogonality, which suggests that any data type could be used with any display mode and any interaction technique, is problematic because some displays and tasks are more appropriate for a given data type than others. The previous works that fall within our particular domain are the taxonomies presented by Milgram and Colquhoun [5], who proposed a taxonomy for real and virtual worlds, and Dubois et al. [19], [20], who specifically looked at mixed reality systems for computer-aided surgery. Dubois' taxonomy (OPaS) focuses on classifying entire augmented reality systems using an interaction centered approach. OPaS is based on four factors: Object (the data), Person (the user), Adapter (the bridge between real and virtual worlds, e.g., a mouse), and System (the
computer component). Dubois et al. extended OPaS with the Agent System User Real (ASUR) notation [20], [21], meant specifically for task-centric AR system descriptions. ASUR enables analysis of the information flow and interaction between a system and end user in the real world, focusing on the real and virtual information involved in the user's task to be executed. Milgram specifically focused on mixed visualization in terms of the level of modeling of the world the user interacts with, how immersed the user is, and what the user's viewpoint is. In other words, Milgram suggested that each type of visualization lies upon one of three continua: reality to virtuality, egocentricity to exocentricity, and incoherence to coherence. Neither the ASUR notation nor Milgram's classification is meant for describing the specifics of the user interface or the visually processed data presented to the end user. Finally, Chi [22] described an information visualization Data State Reference (DSR) model based on four data stages (value, analytical abstraction, visualization abstraction, and view) and transformation operators (data transformation, visualization transformation, and visual mapping transformation) that map data between stages. The process model also includes within-stage operators, which are processing steps that happen at a given stage but do not change the underlying data structures. Our interpretation of Chi's DSR model with application to IGS is shown in Fig. 2. For IGS, the process model begins with raw data, acquired from a sensor (e.g., MRI, fMRI, CT, US, etc.). The raw data undergo a data transformation resulting in analyzed data. The visualization processing step transforms the analyzed data into a visual representation of the data, and the visual mapping transformation renders the visually processed data to a graphical view, i.e., the end product of the visualization. The view encompasses where and on what device the user sees the visualization, how the user interacts with the visualization, and how he or she interprets or perceives it. A specific example of the DSR model applied to an AV system for image-guided neurosurgery [9] is given in Fig. 3. Chi's data flow model facilitates the comparison of different visualization techniques by emphasizing the operators used to transform the data into different representations. However, the classification of visualization techniques seems to be based not only on data type and display mode but is also confounded with the applied field or domain. The purpose of our work is to develop a taxonomy to classify techniques and systems within the applied domain of IGS. Numerous taxonomies have been proposed for information visualization and also specifically for mixed reality visualization; however, most were created over 10 years ago and, due to the evolution of mixed reality systems, many may no longer be relevant, especially in the unique domain of IGS. We have designed a more complete taxonomy based on the most pertinent factors described in existing visualization taxonomies, as well as on current IGS systems. The proposed DVV taxonomy is based on three factors: 1) Data, and in particular two data superclasses, as well as a number of data subclasses that are specific to IGS, 2) Visualization processing, and 3) View. In the next section, we describe the three factors of the DVV taxonomy, the classes and subclasses that represent them, and their attributes.
Fig. 2. On the left, the visualization process is depicted; it includes four Data Stages (rectangles) and three Data Transformation or processing stages (hexagons). On the right, examples of the Data Stages or Transformation operators for visualization in image guided surgery are given.
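As a concrete, if simplified, illustration of this pipeline, the sketch below chains the stages of Fig. 2 in Python. It is not taken from any of the systems discussed here; the segmentation, surface extraction, and projection steps are deliberately trivial placeholders for whatever operators a given IGS system would use.

```python
# Minimal sketch of Chi's DSR stages applied to an IGS visualization pipeline
# (cf. Fig. 2). Each operator is a trivial stand-in for a real component.
import numpy as np

def data_transformation(raw_volume):
    """Raw imaging data -> analyzed imaging data (here: a crude threshold segmentation)."""
    return raw_volume > 0.7

def visualization_transformation(analyzed_mask):
    """Analyzed data -> visually processed data (here: voxel coordinates standing in for a surface)."""
    return np.argwhere(analyzed_mask).astype(float)

def visual_mapping_transformation(visually_processed):
    """Visually processed data -> view (here: a trivial orthographic projection to 2D)."""
    return visually_processed[:, :2]

if __name__ == "__main__":
    raw = np.random.rand(32, 32, 32)                      # raw data from a sensor (e.g., an MR volume)
    analyzed = data_transformation(raw)                   # analyzed imaging data
    processed = visualization_transformation(analyzed)    # visually processed data
    view = visual_mapping_transformation(processed)       # what is finally displayed
    print(view.shape)
```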
3 TAXONOMY FOR MIXED VISUALIZATION IN IGS
The DVV taxonomy was developed based on a review of both past and current mixed reality IGS systems. It also builds on our practical experience working with and developing IGS solutions. With the help of domain experts in both IGS and software engineering, we iteratively refined a taxonomy that would take into account the most relevant factors for the development, validation, and evaluation of such a system. The terminology used within our taxonomy is based on a formal literature review, which was conducted on a database of publications in the area of mixed reality visualization systems for IGS. The database was created using the following search in PubMed (http://www.ncbi.nlm.nih.gov/PubMed):
(surg*[Title] OR medical[Title] OR image-guided[Title] OR intervention*[Title] OR operat*[Title] OR AR[Title] OR VR[Title]) AND (reality[Title] OR mixed[Title] OR
Fig. 3. An example of the visualization process in an augmented virtuality system for image guided neurosurgery. The raw data are the 2D microscope image of the visible brain surface and anatomical and functional data resulting from various MR protocols. The analytical abstraction of the data results from the processing of the MR images to acquire surface objects of the tumor, sulci, and eloquent areas. In the visualization processing step, transparency is used to overlay the real microscope image onto the virtual brain surface objects. The visually processed data are then projected onto a monitor in the OR and used by surgeons for navigation.
virtual[Title] OR augment*[Title] OR visualization[Title]) AND (visual* OR stereoscopic OR stereo OR projection OR vision OR head-mounted OR perception OR enhanced) NOT educat*[Title] NOT training[Title] NOT learning[Title] NOT simulat*[Title] NOT (breast augment*[Title]) NOT TV[Title] NOT (virtual surgery[Title]) NOT (surgical planning system[Title])
The search returned 467 results; of these, 58 met our inclusion criteria, which were as follows: The publications that were
Fig. 4. The above figure shows the three factors of our visualization taxonomy (i.e., data, visualization processing and view), as well as the classes and subclasses (solid-line arrows) that represent them. The relationships between them (dashed-line arrows) are also shown. Numbers in the figure specify the cardinality of the relationships. The surgical scenario is associated with both visually processed data and view classes. The view is the component that is interacted with by the user and therefore, the component which is most limited by constraints.
included described an entire visualization system for use in the OR, even if the focus was only on one aspect of the system, e.g., only on visualization processing or only on the display technology used. Further, the publications described a system for use in the OR rather than a simulator for diagnosis, planning, or training. A number of results were excluded: 286 results were rejected as they were not relevant to the domain, 66 because they described planning systems rather than OR systems, 40 because they described only one aspect of a system or were review papers, and 17 more that may have been relevant were rejected because they were written in a foreign language. Based on our review and our prior knowledge of the field, we believe that in order to provide surgeons with a tool to effectively assist them during surgery, three main points should be considered: 1) which of the abundance of available data should be used, 2) how the data can be effectively merged and visualized, and 3) how they should best be displayed and interacted with. We therefore classify a mixed reality IGS system based on three factors: data, visualization processing, and view. The first factor that is used in the DVV taxonomy is data. It includes patient specific and visually processed data classes with the properties of dimensionality, modality, semantic, whether the data are acquired preoperatively or intraoperatively, and whether they represent a real or virtual object in the end view of the visualization process. Such classes are used to classify different visualization systems. Second, the visualization processing, or the algorithms that are used to transform the data to create a pictorial or visual representation of the data, is used for classification. Lastly, we classify based on the view, which is the end product of the visualization and encompasses the display that is used, the perception location where data are introduced, and the interface. These three components, their subclasses, and their relations are depicted in Fig. 4. The components and
classes with their most inherent attributes and the values that they can take on are given in Table 1. In addition to the three factors of our taxonomy, we use the notion of a surgical scenario, which is associated with both data and view classes, to allow us to describe dynamic systems that change based on the end user's current task. The surgical scenario allows us to determine 1) what type of visually processed data should be shown at a particular point in time during the surgery, 2) where they should be viewed, and 3) how they may be interacted with at that step in the surgery. The surgical scenario describes the type of surgery and in particular the number of surgical steps that are executed to perform the surgery. Each surgical step describes the action to be done, its associated precision, and its completion time. By associating an instance of visually processed data with each step, we ensure that the system adapts to the needs of the surgeon as he or she performs different tasks. The surgical scenario classes and their associations are shown in Fig. 5. An ontology, such as the one suggested by Jannin and Morandi [23] and Neumuth et al. [24], may be used for a detailed representation of both the surgical scenario and its steps. In the remainder of this section, we define each of the three factors, data, visualization processing, and view, as well as their subclasses, and justify why each factor is used for classification. We provide diagrams of the classes and subclasses which make up each of the factors. References and examples of the taxonomy components and classes are left to Section 4.
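To make these associations concrete, the following Python dataclasses sketch one possible encoding of the scenario, step, visually processed data, and view classes of Figs. 4 and 5. The field names and types are our own illustrative choices, not part of the taxonomy's formal definition.

```python
# Illustrative (not normative) encoding of the DVV scenario/step/data/view
# associations from Figs. 4 and 5. Field names are hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class View:
    display: str                      # e.g., "external monitor", "HMD", "surgical microscope"
    perception_location: str          # e.g., "patient", "monitor"
    interaction_tools: List[str] = field(default_factory=list)

@dataclass
class VisuallyProcessedData:
    name: str                         # e.g., "tumor surface"
    semantic: str                     # "anatomical", "operational", or "strategic"
    dimensionality: int               # as presented; may be lower than the source data
    is_virtual: bool                  # virtual object (AR) vs. real object (AV)
    view: Optional[View] = None       # where and how this instance is presented

@dataclass
class SurgicalStep:
    action: str                       # e.g., "craniotomy", "tumor resection"
    required_precision_mm: float
    completion_time_min: float
    data: List[VisuallyProcessedData] = field(default_factory=list)

@dataclass
class SurgicalScenario:
    surgery_type: str                 # e.g., "image-guided neurosurgery"
    steps: List[SurgicalStep] = field(default_factory=list)
```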
3.1 Data
We consider two main subclasses of data: patient specific data and visually processed data (with analyzed, derived, and prior knowledge subclasses). In general, the different subclasses of data may be directly viewed or may undergo one or more transformations to become visually processed data. The classes, their subclasses, their most important
TABLE 1 Main Attributes & Common Instances of the DVV Components
In the above table, common instances of the three factors of our visualization taxonomy and the classes that represent them are given
attributes and example instances are depicted in Fig. 6 and are described in the following sections. Detailed data classes and subclasses are used within the taxonomy for easy specification of all types of data that may be presented to the end user. The purpose of using data as part of our classification is to highlight the importance of deciding which of the available data should be shown to the end user. As the number of possible data sets that can be acquired increases, there is a stronger demand for integration of the different modalities and numerous visually processed data sets into a coherent visualization. For example, numerous data sets may be gathered in neurosurgery, including both structural data (MRI, CT, CTA, MRA, DTI) and functional data (EEG, MEG, PET, fMRI), and used for diagnosis and treatment
Fig. 5. The surgical scenario describes the type of surgery and in particular the number of surgical steps that are executed to perform the surgery. Each surgical step describes the action to be done, its associated precision, as well as its completion time. Each surgical step is associated with visually processed data and therefore indirectly with the view.
planning. The problem of which data and information should be used, and how the data should be visually integrated to give a useful representation for a given surgical task, is nontrivial. Showing too much or all of the available data is not always desirable, as it may confound the viewer and make it difficult to highlight the most important features. Furthermore, new imaging modalities or even the fusion of existing modalities may require retraining of surgeons. Therefore, it is important that careful consideration be given as to what type of data should be visualized and at what point during the surgery it is useful. The latter is taken care of by considering the surgical scenario.
3.1.1 Patient Specific Data
Patient specific data may include demographics, clinical scores, or signal and raw imaging data. Clinical scores are measured from the patient and used for diagnosis and treatment. Examples include body mass index (BMI), tumor volume, heart rate, oxygen levels, allowable blood loss, respiratory rate, etc. The attributes of imaging data are the dimensionality, the acquisition sensor, and whether the data are preoperative or intraoperative. Raw imaging data are the direct output of the acquisition system. Their attributes include spatial resolution, whether and which contrast agent was used, and slice thickness. Raw imaging data are transformed into analyzed imaging data in accordance with the visualization processing pipeline (see Fig. 2), and only then visualized.
Fig. 6. The data classes, their most relevant attributes, and example instances are depicted here. The arrows represent inheritance and the gray rounded squares represent example instances of a class.
3.1.2 Visually Processed Data
Visually processed data are presented to the end user by means of the view. Visually processed data may be analyzed imaging data, derived data, or prior knowledge data. They are shown within a particular surgical scenario and more specifically for a specific surgical step. Different surgical steps may use a different representation of a given instance of visualization data or a different view. The visually processed data therefore should be adapted to the surgical step; they should show the end user the most relevant data in their most informative pictorial representation at each point during the surgery. The characteristics of visually processed data include dimensionality, semantic, and whether the data are a real or virtual object. Visually processed data have a notion of semantics based on the current surgical step; they may have a meaning that is strategic, operational, or anatomical. Data that have a strategic semantic deal with planning and guidance. In surgical navigation, representations of virtual tools, and in particular their planned paths, are an example of data with a strategic semantic. Data with an operational semantic deal with actions or tasks; for example, color can be used to indicate different states, and more specifically a tool may turn red as it approaches a high risk area. The most common semantic is anatomical, dealing with the anatomy, physiology, and pathology of the patient. In AR, preoperative and intraoperative models of the patient anatomy, such as tumors and organs, are projected onto the patient giving the surgeon "X-ray vision" beyond the exposed anatomical surface. In AV, intraoperative images from sensors in the OR may be used to enhance preoperative models. Visually processed data may be real or virtual. In AR, data that we add to the real environment are a virtual object. In many AR IGS systems, anatomical data will be projected onto a patient in order to show the target anatomy under the visible surface of the patient. In AV, the data that are added to the virtual environment or scene are real data. In AV IGS systems, live images captured from sensors in the OR are used to enhance preoperative models of the patient.
It should be noted that the dimensionality of the visually processed data may be lower than that of its subclasses (analyzed imaging data, derived data, and prior knowledge data). A 3D object may be displayed in the 3D world by projecting the object in 3D onto the patient. However, more commonly, a 3D object will be visualized in fewer than three dimensions; for example, it could be displayed on a flat monitor or as 2D contours. Analyzed imaging data. Analyzed imaging data are raw imaging data which have undergone a transformation to create a specific data object. The main attribute of analyzed imaging data is the underlying data primitive. Instances of analyzed imaging data primitives in IGS are: point, line, plane, contour, surface, wireframe, and volume. As an example, whereas raw data would be the direct output of the magnetic resonance scanner, the corresponding analyzed imaging data could be the slices rendered as a volume. Prior knowledge data. Prior knowledge data are derived from generic models. Instances of prior knowledge data include: atlases, surgery road maps, prior measurements, tool models, and uncertainty information about the IGS system that is not patient specific. Derived data. Derived data are obtained from processing either only patient specific data, for example, uncertainty due to the calibration, registration, and/or segmentation process, or patient specific data and prior knowledge, for example, brain regions may be segmented using an atlas. Instances of derived data include: labels, uncertainty information specific to the patient and/or the type of surgery, and measurements such as tumor volume and distances between regions of interest.
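As a simple illustration of how derived data such as these measurements might be computed from analyzed patient data, the sketch below estimates a tumor volume and an inter-structure distance from binary segmentation masks; the masks and voxel spacing are synthetic placeholders rather than data from any of the systems discussed here.

```python
# Illustrative computation of two derived-data instances from analyzed imaging
# data: tumor volume and the distance between two regions of interest.
import numpy as np

def tumor_volume_ml(mask, voxel_size_mm):
    """Volume of a binary segmentation mask in millilitres."""
    voxel_volume_mm3 = float(np.prod(voxel_size_mm))
    return mask.sum() * voxel_volume_mm3 / 1000.0  # 1 ml = 1000 mm^3

def centroid_distance_mm(mask_a, mask_b, voxel_size_mm):
    """Distance between the centroids of two binary masks, in millimetres."""
    ca = np.argwhere(mask_a).mean(axis=0) * voxel_size_mm
    cb = np.argwhere(mask_b).mean(axis=0) * voxel_size_mm
    return float(np.linalg.norm(ca - cb))

if __name__ == "__main__":
    spacing = np.array([1.0, 1.0, 1.5])        # mm per voxel (hypothetical)
    tumor = np.zeros((64, 64, 64), dtype=bool)
    tumor[20:30, 20:30, 20:28] = True          # fake tumor segmentation
    vessel = np.zeros_like(tumor)
    vessel[40:42, 40:42, 10:50] = True         # fake vessel segmentation
    print(tumor_volume_ml(tumor, spacing))     # 800 voxels * 1.5 mm^3 = 1.2 ml
    print(centroid_distance_mm(tumor, vessel, spacing))
```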
3.2 Visualization Processing
The visualization processing component of the taxonomy represents the specific visualization techniques or transformations on the data that are used to provide the best pictorial representation of the data for a particular task at a given surgical step. Numerous algorithms and techniques have been developed for visualizing medical data:
Fig. 7. The main classes of the view, their subclasses, and their most relevant attributes are depicted. The arrow heads represent inheritance and the triangle arrow heads represent aggregation (i.e., “has a” relationships). Gray rounded squares represent example instances of a class.
nonphotorealistic rendering (NPR) and illustrative techniques, photorealistic rendering, importance-driven volume rendering, color coding, using transparency, adding depth cues, and using saliency methods (such as highlighting regions of interest). By choosing an appropriate visualization technique it is possible to increase the diagnostic value of the original (or imaging) data. For this reason, we use visualization processing as the second component for classification in the DVV taxonomy.
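Among these techniques, the use of transparency to fuse a virtual rendering with the real surgical view (as in Fig. 3) reduces to a per-pixel compositing step. The minimal sketch below shows this with synthetic images; it is only an illustration of the operation, not the processing used by any particular system discussed here.

```python
# A minimal sketch of one visualization-processing operation mentioned above:
# alpha blending a rendered virtual structure (e.g., a tumor surface in red)
# over the real camera or microscope image. Images here are synthetic arrays.
import numpy as np

def blend(real_rgb, virtual_rgb, virtual_mask, alpha=0.4):
    """Composite the virtual rendering over the real image with transparency.

    real_rgb, virtual_rgb: float arrays in [0, 1] of shape (H, W, 3)
    virtual_mask: boolean array (H, W), True where the virtual object is drawn
    alpha: opacity of the virtual object (0 = invisible, 1 = fully opaque)
    """
    m = virtual_mask[..., None]  # broadcast the mask over the color channels
    return np.where(m, (1.0 - alpha) * real_rgb + alpha * virtual_rgb, real_rgb)

if __name__ == "__main__":
    h, w = 240, 320
    real = np.random.rand(h, w, 3)                         # stand-in for the microscope image
    virtual = np.zeros((h, w, 3)); virtual[..., 0] = 1.0   # red virtual overlay
    mask = np.zeros((h, w), dtype=bool); mask[80:160, 120:200] = True
    fused = blend(real, virtual, mask, alpha=0.4)
    print(fused.shape, fused.min(), fused.max())
```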
3.3 View
The view of the system is the end product of the visualization process, and therefore the part of the system with which the user interacts. It has three major components: the display, the perception location, and the interaction tools (see Fig. 7). The particular combination of display, perception location, and interaction tools determines where on the reality-virtuality continuum the system lies. The view of the system is the factor that is most limited by the constraints of IGS, and therefore may require solutions specific to the domain. The display and perception location are limited by the physical constraints of the OR, which include the need for sterile equipment, available space, and ensuring freedom of movement of the surgeon. The interaction tools, which are part of the user interface, take into account the constraints of the end user, which include the cognitive load of the user, the reaction of the user to system errors, the user's workflow, and the amount of relevant information presented to the user at a given point in time. The view is also associated with the surgical scenario; each of the visually processed data for a given surgical step is presented at a particular perception location, on a particular display, and may have only a subset of possible interactions associated with it. In addition, the constraints of the surgical scenario, including the precision of the surgical task and the cost/benefit ratio of the procedure when using the system, must also be considered. The view is used to classify mixed reality IGS systems as it is perhaps the most relevant component to the end user. It enables a system description in terms of where and how
information is presented to the end user and how the end user can interact with the system.
3.3.1 Perception Location
The perception location is the area or part of the environment where we focus our attention in order to benefit from the mixed visualization. The perception location may be the patient, a digital display, a surgical tool, or the real environment. Depending on where the visualization system lies on the reality-virtuality continuum, the perception location may be either a real or virtual environment. The perception location falls into one of two categories: those that require the surgeon to look away from the surgical field of view, and those where this is not required. In order to determine the perception location, three things should be considered: 1) whether it is desirable for the surgeon to look away from the surgical field of view (if not, a head-mounted display (HMD), hologram, or silvered mirror might be the best option); 2) whether other people in the OR may benefit from the visualization, in which case an external monitor that can be viewed from different locations in the OR could be used; and 3) whether there are locations in the OR or on the patient which should not be occluded by virtual objects. For example, occluding parts of the anatomical surface of the patient with virtual objects may not be feasible for certain surgical tasks. In general, the choice of perception location should allow for a seamless and smooth integration of real and virtual information.
3.3.2 Display
The display is the particular technology that is used to present data to the end user. Attributes of the display include the field of view of the display, the resolution, and the size of the display. A number of possible display technologies exist. Here we consider only those displays that have been used in IGS. Displays fall into one of two classes: those in which images are projected onto a 2D technology, for example, a computer monitor or a touchscreen display, and those which provide a 3D impression of the scene or object, for example, an HMD. Those that provide 3D impressions fall into one of two categories: 1) binocular stereoscopic systems, which require
the use of special head gear or glasses, and 2) autostereoscopic displays, which do not [25]. Examples of binocular stereoscopic displays include HMDs, polarized glasses, and shutter glasses. Autostereoscopic displays, which have been used in IGS, include multiview lenticular displays, videography, and holography. The surgical microscope, which allows for both 2D and 3D data representation, has also been used to fuse virtual objects with the real scene in IGS.
3.3.3 Interaction Tools
The interaction tools are part of the user interface, which deals with the design, implementation, and evaluation of computer systems used by humans. We suggest two major subclasses of interaction tools: hardware interaction tools, which are the physical devices that a user employs in order to manipulate data, and virtual interaction tools, which allow the user to manipulate the pose and view of the data as well as the visualization parameters of the data. Examples of hardware devices for interaction used in IGS include a mouse, a keyboard, a representative tangible object (e.g., a surgical tool), a haptic device, or a data glove. Instances of virtual interaction tools include transfer functions, volume cutting, voxel peeling, clipping planes, turning data visibility on and off, and, in general, adjusting data properties such as color, brightness, contrast, and transparency.
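As an illustration of one such virtual interaction tool, the sketch below implements a user-editable piecewise-linear opacity transfer function; the control points and intensity ranges are invented for the example and do not come from any of the systems surveyed here.

```python
# Sketch of one virtual interaction tool named above: a piecewise-linear
# transfer function that a user could edit to change how normalized intensities
# map to opacity during volume rendering. Control points are arbitrary examples.
import numpy as np

def opacity_transfer_function(intensities, control_points):
    """Map scalar intensities to opacities via linear interpolation between
    user-defined (intensity, opacity) control points."""
    xs, ys = zip(*sorted(control_points))
    return np.interp(intensities, xs, ys)

if __name__ == "__main__":
    # Hypothetical control points: soft tissue nearly transparent, tumor range opaque.
    points = [(0.0, 0.0), (0.3, 0.05), (0.6, 0.8), (1.0, 1.0)]
    volume = np.random.rand(16, 16, 16)          # stand-in for a normalized MR volume
    opacities = opacity_transfer_function(volume, points)
    print(opacities.shape, opacities.min(), opacities.max())
```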
4 EXAMPLE INSTANTIATION OF THE DVV TAXONOMY
In order to validate our taxonomy we studied state-of-the-art publications that described mixed reality visualization systems for IGS. For the instantiation, we created a second database based on the following Google Scholar search:
allintitle: reality CAS OR "computer assisted" OR medical OR surgery OR surgical OR intraoperative OR "intra-operative" OR neurosurgery OR operating -planning -education -simulation -simulations -simulator -simulators -trainer -training -myth -plastic
Search only in: Biology, Life Sciences, and Environmental Science; Engineering, Computer Science, and Mathematics; Medicine, Pharmacology, and Veterinary Science
The search returned 416 results and 109 publications met our inclusion criteria as described in Section 3. In order to get an overview of the current work that is done in the area, we chose the research groups or authors that had two or more publications in the database since the year 2000. From these research groups and authors, we chose the most recent publication. However, if that publication did not describe the developed mixed reality system in sufficient detail, we chose the next most recent publication. Furthermore, if a particular research group had significantly different approaches to developing a system, we included each of the different publications. For example, this was the case with
the group from the University of Karlsruhe, who looked at both projector-based [26] and stereo microscope-based [27] mixed reality IGS systems. The selection criteria resulted in 17 publications. The selected papers are summarized in Table 2, where the taxonomy components are used as columns. Below, we look at each of the components of the taxonomy and discuss the solutions presented in the selected publications.
4.1 Data
All but one of the papers described the raw imaging data that were used. All of the papers described the analyzed imaging object data type that was used and how the analyzed data were visualized. Very few papers, however, mentioned the use of derived or prior knowledge data.
4.1.1 Patient Specific Data
All but one paper [26] specified the sensor by which the data were acquired or the modality of the data (raw data) that were visualized. As well, all of the papers specified the analyzed data object type that was used. In the majority of papers, either surfaces, extracted contours, or wireframe representations of the modality data were used. In the case of Linte et al.'s work [28], which used intraoperative ultrasound data, planes representing 2D slices were used. Slice views (planes), which are traditionally used for medical image viewing, were also used in two systems [29], [30]. In the work of Glossop et al. [31], where a laser projector was used, and in the work of Kahrs et al. [26], where multiple projectors were used, the type of data that were visualized was limited to points, lines, and/or contours. The use of simple representations such as planes, surfaces, contours, and wireframes allows for high frame rates and update frequencies. Although the use of volume data may be desirable, it may not always allow for real-time rendering for intraoperative surgical decision making and image guidance. Volume data were used in five of the systems [32], [33], [34], [35], [36]. Of these, two groups reported system speeds. Scheuering et al. [34] and Konishi et al. [32] both injected volume images into endoscopic real-time video but reported very different frame rates. The former achieved 25 fps, and the latter reported rates of 3-4 fps and high latency. In the system developed by Simitopoulos and Kosaka [29], surfaces and orthogonal cutting planes of preoperative images were injected into real-time video, and a rate of only 6 fps was reported. In Suzuki et al.'s [35] endoscopic surgical robot system, which uses both surfaces and volumes, frame rates of 5-10 fps were measured during an animal experiment. The hardware being used, the video capture rates, and the type of visualization processing that is done will all contribute to the speed of the system. However, as shown by Scheuering et al. [34], it is possible to use volume rendering visualization techniques, and as the speed of hardware increases the use of volume data should become more common.
4.1.2 Visually Processed Data
Little attention was given to how data should be presented to the user in order to provide the most informative representation of the data and to enable a good spatial and
TABLE 2 DVV Classification of Mixed Reality IGS Publications
structural understanding of the data. The majority of the visually processed data described in the publications were color coded surfaces of the regions of interest [37], [38], [39], [28], [34], [29], [40], [35], or transparent data [37], [28], [34], [40], [35], [30], [36]. The visually processed data sometimes had a lower dimensionality than raw data. In several works [37], [31], [26], [29] only points, contours, or 2D planes were extracted from the 3D raw (preoperative) images and presented to the user.
In the majority of the papers, data with an anatomical semantic were used. Preoperative or intraoperative models of the patient’s anatomy of interest were used to extend the surgeon’s visual field of view of the patient. Data with an operational semantic (dealing with tasks) were visualized in three works [41], [38], [40] to depict the proximity of a tool to a target. In addition, in one system [30] data with a strategic semantic (dealing with planning or guidance) were visualized by choosing different colors to
342
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS,
differentiate between the actual biopsy needle and the extrapolated path. Only six systems ([38], [42], [39], [28], [30], [36]) described the visualization of prior knowledge data, and this only in terms of tool depiction. Tools such as an endoscope [38], marker pins [42], Raman probes [39], a catheter [28], a biopsy needle [30], and a surgical drill [36] were localized and depicted on screen to guide the surgeon. The other systems described in the publications did not specify whether prior knowledge data were used, nor how they were rendered on screen. Six of the systems in the 17 publications also described the visualization of derived data. Three systems visualized the distance of a tool to the target anatomy. Kawamata et al. [38] showed the distance using a bar graph, Soler et al. [40] showed the distance numerically, and Birkfellner et al. [41] changed the representation of the target object from solid to wireframe based on the proximity of the tool. Other types of derived data were also visualized in the different systems. In the work by King et al. [39], colored markers which represent different tissue classes (based on information from Raman spectroscopy) were shown. Uncertainty information in terms of the target registration error (TRE) was represented graphically in the work of Linte et al. [28] using a 95 percent confidence ellipsoid. Suzuki et al., who described a surgical robot with augmented reality functions in their work, visually displayed haptic sense information from the right and left robot arms, the forceps indicator, the location of the robot tip, and the patient's vital signs [35].
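How such a 95 percent confidence ellipsoid can be obtained from an estimate of the TRE covariance is sketched below; the covariance values are invented, and the construction shown is the standard Gaussian-error approximation, not necessarily the exact method used by Linte et al.

```python
# Sketch of deriving a 95 percent confidence ellipsoid, such as the one used to
# depict target registration error (TRE), from a 3x3 covariance estimate of the
# error. The covariance here is invented for illustration.
import numpy as np
from scipy.stats import chi2

def confidence_ellipsoid(covariance, level=0.95):
    """Return (semi-axis lengths, axis directions) of the confidence ellipsoid."""
    eigvals, eigvecs = np.linalg.eigh(covariance)   # principal error directions
    scale = chi2.ppf(level, df=3)                   # 3 degrees of freedom in 3D
    semi_axes = np.sqrt(scale * eigvals)            # ellipsoid semi-axis lengths (mm)
    return semi_axes, eigvecs

if __name__ == "__main__":
    cov = np.diag([1.2, 0.8, 2.5])    # hypothetical TRE covariance in mm^2
    axes, dirs = confidence_ellipsoid(cov)
    print("95% ellipsoid semi-axes (mm):", np.round(axes, 2))
```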
4.2 Visualization Processing
In the selected publications, visualization processing dealt mostly with the visualization of anatomical data. For the most part, the processing was limited to color coding structures [37], [38], [39], [28], [34], [29], [40], [42], [35], using saliency methods to add emphasis by outlining regions of interest [40], [36], using transparency to combine modality data [37], [28], [32], [34], [40], [35], [30], [36], using lighting and shading [42], [34], [36], and using depth cues such as stereo [27], [41], [42] and occlusion [29], [36] to enhance perception. In two of the papers [32], [33], the visual processing of the data was not mentioned, and in two others, where projectors were used [31], [26], visualization processing was not applicable.
4.3 View
In all of the papers the perception location and display were specified. The interaction component, however, was mentioned less often, and in the publications where it was mentioned the discussion was limited.
4.3.1 Perception Location
In eight of the papers, the perception location was the patient [27], [41], [42], [31], [26], [33], [30], [36], in eight it was an external digital monitor [37], [38], [39], [40], [34], [29], [32], [35], and in one the perception location could be either the patient or a monitor [28]. For laparoscopic or endoscopic surgery, a monitor was always used as this is already present in the OR and traditionally used for this type of surgery. However, for other types of surgery the patient may be preferable as this allows the surgeon or resident to keep his or her focus on the surgical field of view. This suggests
that the perception location might be linked closely with the type of surgery that is being done.
4.3.2 Display
In half of the systems, digital computer monitors were used [37], [38], [39], [40], [32], [34], [29], [35]. Advantages of monitors include the ability for multiple users to benefit from the visualization, and the fact that monitors are already available in the OR. The latter implies that there is no need to introduce a new display device, making it both cost effective and nonintrusive. In the systems where the patient was used as the perception location, the display device varied. In two publications, one [31] or more [26] laser projectors were used. In the work by Mischkowski et al., a new display device based on a camera and a portable LCD screen (the X-Scope) [33] was used and tested. Using this device the user can walk around and see the internal anatomy of the patient from different viewpoints. In two of the selected publications [30], [36], HMDs were used as the display device and in three a head mounted operating microscope was used [27], [41], [42]. HMDs can be either optical see-through, where half-transparent mirrors are used to reflect computer-generated images on top of the real world to the user, or video see-through, where video images of the real world are captured using two video cameras attached to the head gear and are combined with computer generated images to be viewed by the user. The HMDs that were used in these publications were video see-through [30], [36]. Video see-through systems guarantee registration accuracy between real and virtual scenes; however, this comes at the cost of a potential mismatch between visual and proprioceptive information [43]. The main advantage of optical see-through systems, which were used in the head mounted operating microscope systems [27], [41], [42], is that the real environment is not occluded, ensuring synchronization between vision and proprioception [43]. For a detailed comparison and analysis of the advantages and disadvantages of optical and video see-through HMDs, the reader is directed to the work of Rolland and Fuchs [43]. In three of the chosen publications a surgical microscope or a variation thereof was used. The use of a device that is already present and used in the OR environment seems a logical solution for a visualization system, as it may reduce the disruption of the workflow of the surgeon and seamlessly fit into the infrastructure of a surgical navigation system. In fact, the two most popular IGS systems that are available commercially for neurosurgery, BrainLab and Medtronic, use the microscope. In the selected publications, the MAGI (microscope-assisted guided interventions) [42] system, the Varioscope AR [41], and the system developed by Aschke et al. [27] enable correct stereoscopic visualization of the virtual features presented in the optical path of the microscope. Unlike other works, where 2D contours of tumors or eloquent cortex are projected onto the focal plane of the microscope and therefore are not seen at the correct depth [44], [45], the solutions presented in MAGI, the Varioscope AR, and by Aschke et al. [27] enable correct 3D stereo perception of the overlaid structures. In the first, this was achieved by displaying an offset stereogram image of the virtual information into each of the microscope oculars;
in the second, by means of two miniature VGA displays, one for each eye; and in the third, using micro displays.
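The horizontal offset that makes such overlays appear at the correct depth can be illustrated with a simplified pinhole stereo model; the baseline and focal length below are invented for the example and do not correspond to any actual microscope.

```python
# Simplified pinhole-stereo sketch of why stereoscopic overlays present a
# horizontally offset image to each ocular: a virtual point at depth Z projects
# to different horizontal positions in the left and right views. All parameters
# (baseline, focal length) are illustrative only.
import numpy as np

def stereo_projections(point_xyz, baseline_mm, focal_px):
    """Project a 3D point (camera coordinates, mm) to left/right image x-coordinates."""
    x, _, z = point_xyz
    x_left = focal_px * (x + baseline_mm / 2.0) / z
    x_right = focal_px * (x - baseline_mm / 2.0) / z
    disparity = x_left - x_right          # = focal_px * baseline_mm / z
    return x_left, x_right, disparity

if __name__ == "__main__":
    xl, xr, d = stereo_projections(np.array([5.0, 0.0, 250.0]),
                                   baseline_mm=24.0, focal_px=1200.0)
    print(round(xl, 1), round(xr, 1), round(d, 1))   # closer structures -> larger disparity
```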
4.3.3 Interaction Tools
As mentioned above, the scope of the interaction component in our taxonomy is limited to hardware interaction tools and data manipulation via virtual interaction tools. Other components that fall under interaction include the usability of the system, the learning curve, the user's situation awareness, the user's satisfaction, and the perception of the system over time. These are rarely described in works in the area of mixed reality visualization systems for IGS. Even the hardware interface and virtual interaction tool components of the DVV taxonomy are rarely mentioned or analyzed in this domain. In the selected publications, only one paper described the use of a mouse to interact with the system and fewer than half of the papers had any mention of how the user could interact with and change the viewing and/or visualization parameters of the data on screen. The discussion of the virtual interaction component in the selected papers was, with few exceptions, limited to allowing the user to rotate to get a new view of the data, navigate through the scene, and change the position of the virtual camera [41], [32], [28], [33], [34], [29], [40], [35], as well as allowing the user to toggle components on and off and change the opacity or color of objects [40]. A novel interaction component was used in Wieczorek et al.'s work [36], where a virtual mirror could be extended into the patient to provide a view of what is behind an occluding object. In two of the papers, the screen interface of the system was described. Konishi et al. [32] used a four window view, showing the oblique endoscopy, the virtual endoscopy, the 3D US on endoscopy (AR), and the real endoscopy. In Simitopoulos and Kosaka's [29] system, a three window view is used, with one main window and two subwindows showing the orthogonal slice views. In terms of the hardware interface component and interaction tools, a number of works described the hardware interface that could be used to change the view. In Konishi et al. [32], the viewing direction could be changed by rotating a scope cylinder, even if its camera head was unmoved. To move or rotate the camera in the work of Simitopoulos and Kosaka [29], the user moves and rotates a MicroScribe5 (a pen-like 3D digitizing system). The X-Scope system described in Mischkowski et al. [33] is a portable hand held display that can be easily moved around the patient to access different views. Interaction with particular tools was seen in King et al.'s work [39], where a Raman probe was used for scanning tissues, and in Suzuki et al.'s [35], where the robotic arms used in surgery return haptic feedback.
4.4 Surgical Application
In all but three papers [39], [29], [36] the type of surgery that the system was meant to be used for was specified. Even in the three papers where the particular surgical domain was not specified, the type of raw data that the system required would limit the types of surgery that could be performed using the system. For example, both Simitopoulos and Kosaka [29] and Wieczorek et al. [36] mention the use of real-time
5. http://www.3d-microscribe.com.
video, suggesting endoscopic or laparoscopic surgery. King et al. [39] use Raman spectroscopy to classify tissues; therefore, it seems tumor resections would benefit most from this type of system. In the selected publications there was no work done on developing a general purpose medical mixed reality system. This is in part due to the fact that the type of surgery, the steps involved in the surgery and how they are executed (i.e., the workflow), and the required accuracy and amount of time each surgical step takes will drive what solutions can or should be used for each component. For example, as we can see from the selected publications, the current use of a monitor for endoscopic or laparoscopic surgery has led developers to choose to augment the real-time video images from the endoscope rather than augment reality by projecting onto the patient. Each surgery will have its own workflow, required outcome, and accuracy, and these will drive the type of mixed visualization that is used.
4.5 Validation and Evaluation
In determining the most relevant factors of a mixed reality visualization system, we also determined the components which must be evaluated and validated to demonstrate whether there is added value in using the system. We differentiate between validation and evaluation in the following way: validation proves the accuracy of a system and shows that the right system was built, whereas evaluation is concerned with showing the usefulness and value of the system. In the following section, we look at which components, described in the selected publications, were validated or evaluated. Table 3 summarizes these validations and evaluations. All of the papers described some sort of validation or evaluation of their systems. In many papers, the focus of validation was on overall system accuracy [37], [31], [26], [39], [33], [29], [40]. A number of works also looked at registration accuracy [38], [42], [34], [40], [35], [30] and system speed [32], [34], [29], [30], [36]. In a few papers, the visualization component of the system was validated. Birkfellner et al. [41] compared stereoscopic vision to monoscopic vision, both with and without proximity cues, and found that target localization was quicker and more often successful when stereo was used and when proximity cues were used. In the MAGI system [42], the chrominance, luminance, and spatial frequency of the image that was overlaid into the oculars of the microscope were studied. The results showed that by using textures with locally unique spatial frequencies and by controlling illumination or by using wireframe representations of virtual objects with different texture mappings, depth perception could be enhanced. Linte et al. [28] looked at how analyzed data should be visualized and found that using two orthogonal views provided better perception of depth for guidance compared to the other visualization methods studied. Two papers [27], [28] reported examining the use of displays. In Aschke et al. [27], the use of different display hardware, mini beamers, liquid crystal display panels, and micro displays, was studied for incorporation into their augmented microscope. The resolution, brightness, contrast, color depth, and refresh rate were examined for each type of display and based on their results the authors chose
TABLE 3 Validation and/or Evaluation of Mixed Reality IGS Systems
to use micro displays with a resolution of 1,024 by 768 pixels. Linte et al. [28] reported that most of their users found HMD displays more intuitive than external monitors. Five papers [32], [28], [33], [35], [30] reported on the use of the system as a whole. Konishi et al. [32] used their system in 20 clinical cases covering five different types of organ surgery.
Qualitative results showed that the surgeons understood the spatial relationships between anatomical structures better, and found the views of occluded objects useful. Linte et al. [46] compared the use of US guidance alone and traditional endoscopy to their VR-enhanced US guidance system on a beating heart phantom and found that their
Fig. 8. In the above graph, the percentage of components or classes which were described in the selected publications is shown. For example, 88 percent of the papers described the type of analyzed imaging data that are used in their visualization system.
method was significantly better than US guidance alone and was not significantly worse than endoscopy. The X-Scope [33] was used on five patients undergoing bimaxillary orthognathic surgery. Although surgery time was significantly prolonged, all five surgeries were successful. Lastly, Suzuki et al. [35] provided qualitative results on the usefulness of their system based on an animal study. In the selected publications, a number of the systems' components were validated; however, few systems were evaluated. Most systems were validated only with numerical methods and/or phantoms, and they were not evaluated in clinical settings with patients. The visualization processing, view, and user interface components, which would require user studies for evaluation, were not evaluated in any of the publications. However, unlike imaging modalities such as US and MRI, which have a direct impact on a system by providing new information, mixed reality visualizations provide only indirect assistance. The benefit of such indirect assistance is not self-evident, and therefore validation and evaluation are needed to prove whether it is valuable. Fewer than one third of the publications reported any assessment of the system as a whole. The use of mixed reality visualization as a standard tool has not yet been reported, and clinical studies showing the integration of mixed reality into the clinical environment and workflow are rare [3]. This is in part due to the complexity of IGS mixed reality systems, which makes the task of evaluation and validation difficult. It is often not feasible to evaluate a system based on surgical outcome or the impact of the system on the patient. It is possible, however, to begin to indirectly evaluate a system by taking into account the surgical scenario and, in particular, how the visually processed data used at each surgical step affect the completion time and the precision with which each step is accomplished. Such evaluation is needed to prove the added value of using particular components and a particular system, and in turn to ensure the use of the system in the OR.
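As an illustration of this kind of indirect, step-wise assessment, the following is a minimal sketch, not taken from any of the cited studies, of logging completion time and placement error per surgical step under two display conditions and reporting the per-step differences; the step names, conditions, and numbers are hypothetical.

```python
# Hypothetical sketch: compare per-step completion time and precision between a
# conventional display condition and a mixed reality condition. All data invented.
from statistics import mean

# trials[condition][step] -> list of (completion_time_s, target_error_mm)
trials = {
    "standard": {"craniotomy planning": [(142, 3.1), (155, 2.8)],
                 "tool-to-target navigation": [(98, 2.4), (110, 2.9)]},
    "mixed_reality": {"craniotomy planning": [(120, 2.6), (131, 2.5)],
                      "tool-to-target navigation": [(85, 1.9), (92, 2.2)]},
}

for step in trials["standard"]:
    t_std = mean(t for t, _ in trials["standard"][step])
    t_mr = mean(t for t, _ in trials["mixed_reality"][step])
    e_std = mean(e for _, e in trials["standard"][step])
    e_mr = mean(e for _, e in trials["mixed_reality"][step])
    print(f"{step}: time {t_std:.0f}s -> {t_mr:.0f}s, error {e_std:.1f}mm -> {e_mr:.1f}mm")
```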
5 TAXONOMY VERIFICATION
The task of evaluating an ontology (or taxonomy), although very important, is not trivial. Both qualitative and quantitative methods have been suggested for evaluating ontologies, including: having domain experts rate the ontology based on particular criteria (e.g., consistency, completeness, conciseness, expandability, and sensitiveness [47]), using
data-driven approaches [48], or using particular coding schemes to encode the ontology (e.g., concepts may have an essence, identity, unity, and dependence [49]). Brewster et al. [48] suggested that an ontology can be evaluated using a “goodness of fit” measure. In this statistical approach, an ontology is compared to a corpus (or a set of natural language texts) rather than to other ontologies. This can be done by, for example, counting the number of terms that overlap between the ontology and the corpus, and penalizing the ontology both for terms that are absent in the ontology but are present in the corpus and for terms present in the ontology and absent in the corpus [48].
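To make this kind of measure concrete, the following is a minimal set-based sketch, not the statistical procedure of Brewster et al. [48], assuming the taxonomy terms and corpus terms have already been extracted and normalized; the example term lists and the penalty weight are hypothetical.

```python
# Minimal sketch of a set-based "goodness of fit" between taxonomy terms and a
# corpus vocabulary: reward overlap, penalize terms missing on either side.
# The term lists and the 0.5 penalty weight are illustrative placeholders.

def goodness_of_fit(taxonomy_terms, corpus_terms):
    taxonomy = set(t.lower() for t in taxonomy_terms)
    corpus = set(t.lower() for t in corpus_terms)
    overlap = taxonomy & corpus
    missing_from_taxonomy = corpus - taxonomy   # corpus terms the taxonomy lacks
    unused_in_corpus = taxonomy - corpus        # taxonomy terms the corpus never uses
    penalty = 0.5 * (len(missing_from_taxonomy) + len(unused_in_corpus))
    score = (len(overlap) - penalty) / max(len(taxonomy | corpus), 1)
    return score, overlap, missing_from_taxonomy, unused_in_corpus

if __name__ == "__main__":
    taxonomy = ["volume", "surface", "opacity", "display", "registration"]
    corpus = ["volume", "registration", "accuracy", "display", "endoscope"]
    score, overlap, missing, unused = goodness_of_fit(taxonomy, corpus)
    print(f"score={score:.2f}, overlap={sorted(overlap)}")
```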
5.1 Goodness of Fit
To evaluate the DVV taxonomy, we looked at the overlap of terms between the taxonomy and the literature. Specifically, we examined the overlap between the instances that a component or class in our taxonomy can take on and the terms found within the chosen publications (Table 1). It was not possible to fully automate this method because terms in our taxonomy are most often context dependent. For example, the analyzed imaging data attribute primitive may take on the value "volume"; in other words, analyzed imaging data may be a 3D volumetric data set. However, the term "volume" can also be used in contexts that do not refer to volumetric data, for example as a class of derived data such as a "volume of interest" or "tumor volume." The term overlap between our taxonomy and the corpus is therefore based on the previous analysis of the 17 publications (Table 2). The percentage of papers which described the classes that represent each factor within the taxonomy is shown in Fig. 8. According to Brewster et al. [48], the proposed taxonomy could be penalized for failing to find terms related to prior knowledge data, derived data, and interaction. However, we believe that the lack of this information in the selected publications is indicative of what current and future research should focus on, and we therefore consider these components and subclasses to be integral parts of the taxonomy. The taxonomy should only be penalized if particular components are not applicable to IGS visualization systems. This was found to be the case only for the two systems in which a laser display was used [31], [26]: since only vector objects could be visualized, it was not possible to display prior knowledge or derived data. In all other cases, the unspecified components could be integrated into the IGS visualization system. Furthermore, each factor of the component was specified in at least one publication. In the discussion (Section 6), we
TABLE 4 Term Frequency in the 17 Chosen Publications
In the above table, the 30 most frequent terms and the number of times they appeared in the 17 chosen publications (frequency) are shown.
provide examples from the literature to demonstrate that these components are significant to IGS and therefore should be factors in the DVV taxonomy.
5.2 Completeness
To ensure the completeness of our taxonomy, we found the most frequent terms in the chosen publications and then checked whether the terms and the concepts they describe are taken into account by the DVV taxonomy. If more than one publication described a given system, we chose only the most recent publication so as not to skew the results. Frequent terms were found using the concordance tool in DEVONthink Pro Office Version 2.0 (public beta 8). The first 30 terms (omitting articles, pronouns, and prepositions), in order of their frequency, are shown in Table 4. One noticeable pattern was the frequent use of terms such as registration, accuracy, and error. The evaluation of IGS systems, and in particular mixed reality systems, in terms of calibration, registration, accuracy, target localization accuracy, etc., is necessary for the acceptance and use of the systems in the OR. It is therefore not surprising that these terms were frequent. However, one purpose of the DVV taxonomy is to define all of the components which should be evaluated, not specific evaluations. Terms describing the type of surgery, the type of anatomy, or the task at hand (e.g., catheter, needle, etc.) were also common among the papers. The common use of terms dealing with the type of surgery and the surgical task may suggest that classification should be done according to the type of surgery. In fact, a number of proposed taxonomies have taken tasks to be a main element for classification [14], [15], [17], [18]. Although we believe that the end user's task-to-be-completed is an important component of a visualization system, our taxonomy focuses on internal data structures, processing, and view, rather than on their application toward a task or goal. Furthermore, it is possible to discuss a mixed reality visualization system that is not specific to a given type of surgery (see, for example, [39], [29], [36]) or that is specific to a given modality rather than a type of surgery, as in Stetten and Chib [50], who describe a system that may be used for any ultrasound-guided intervention. The lack of other common terms in the corpus leads us to believe that the DVV taxonomy is complete in its ability to describe IGS systems.
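The frequency analysis itself is straightforward to approximate; the sketch below is an illustrative stand-in (a simple tokenizer with a small hand-made stopword list, not DEVONthink's concordance tool), assuming the publication texts are available as plain-text files whose names are hypothetical.

```python
# Illustrative term-frequency count over plain-text versions of the publications,
# omitting articles, pronouns, and prepositions via a small stopword list.
import re
from collections import Counter
from pathlib import Path

STOPWORDS = {"the", "a", "an", "of", "in", "on", "for", "to", "with", "and", "or",
             "is", "are", "was", "were", "by", "that", "this", "it", "we", "as", "be"}

def term_frequencies(paths, top_n=30):
    counts = Counter()
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="ignore").lower()
        words = re.findall(r"[a-z]+", text)
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)

# Example (hypothetical file names):
# for term, freq in term_frequencies(["paper01.txt", "paper02.txt"]):
#     print(term, freq)
```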
5.3 Comparison to IGSTK
The image-guided surgery toolkit (IGSTK) is an open source C++ software library with a component-based architecture that provides an environment for fast prototyping of image-guided surgery applications [51]. Here, we briefly look at the overlap between the main components of IGSTK and the DVV taxonomy. The five main components of the IGSTK library are: view, spatial objects, spatial object representation, trackers, and readers [51]. The view, which is the link between the graphical user interface and the rest of the IGSTK toolkit, is where the graphical renderings of the data and the surgical scene are displayed [51]. It therefore overlaps with our notions of both the perception location and the display device. The spatial objects are the geometrical representations of the objects in the surgical scene, and thus correspond to the analyzed data component of the DVV taxonomy. The spatial object representation is the visual representation that deals with properties of the geometrical representation such as color, opacity, etc. This ties in with the notions of visually processed data and visualization processing as presented by the DVV taxonomy. Trackers, which handle communication between tracking tools and devices, are represented in our taxonomy in two ways: first, as rendered data objects that are visualized, most commonly as prior knowledge data in the form of surgical tools; and second, as hardware tools, which are used to manipulate data and interact with the system. Readers, which bring data into the surgical scene, tie in with the notions of both the data and visualization processing components of the DVV taxonomy. The tracker and reader components, however, are rather low-level development classes which handle data and tracker input, and therefore are not directly represented in our high-level taxonomy. The only DVV component that is not directly represented in the IGSTK library is the software tools, which are part of the DVV interface component. Software interface tools, which are used to manipulate the spatial object representation, are not considered one of the five main components; however, they would be handled within the classes and state machine of the IGSTK toolkit.
As we can see, there is a direct link between the DVV taxonomy and a commonly used toolkit for developing IGS applications. This suggests that our taxonomy would be useful to developers both for determining the software specifications of a system and for designing its architecture.
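To illustrate how the taxonomy's components could serve as the backbone of such a specification, the following is a minimal sketch of a DVV classification record; the field names mirror the component names used in this paper, while the example values are hypothetical and do not describe any of the cited systems.

```python
# Minimal sketch of a DVV classification record. Field names mirror the DVV
# components (Data, Visualization processing, View); example values are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Data:
    analyzed_imaging_data: List[str]                                # e.g., MRI/CT volumes or surfaces
    prior_knowledge_data: List[str] = field(default_factory=list)   # e.g., tool models, atlases
    derived_data: List[str] = field(default_factory=list)           # e.g., registration uncertainty

@dataclass
class View:
    display_device: str                                    # e.g., "OR monitor", "HMD"
    perception_location: str                               # e.g., "monitor" or "patient"
    interaction_tools: List[str] = field(default_factory=list)

@dataclass
class DVVClassification:
    data: Data
    visualization_processing: List[str]                    # e.g., "surface rendering", "transparency"
    view: View

example = DVVClassification(
    data=Data(analyzed_imaging_data=["MRI volume", "vessel surfaces"],
              prior_knowledge_data=["tracked pointer model"],
              derived_data=["TRE confidence ellipsoid"]),
    visualization_processing=["volume rendering", "color coding"],
    view=View(display_device="OR monitor", perception_location="monitor",
              interaction_tools=["tracked pointer", "foot pedal"]),
)
```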
6 DISCUSSION
In the following section, we explore the components of the DVV taxonomy that were not clearly described in the chosen 17 publications. We draw on examples from the literature to demonstrate that the DVV components are present in mixed reality visualization systems for IGS, and therefore are relevant to such a taxonomy. Furthermore, we highlight the importance of carefully considering particular solutions for these components.
6.1 Data: Prior Knowledge and Derived Data
While patient-specific imaging data were well defined in the 17 chosen publications, few publications described or included prior knowledge or derived data. The prior knowledge data that were described consisted, for the most part, of how surgical tools were visualized for navigation and guidance. In general, tool tracking is a key element of IGS systems. The location of a tool in the real world is determined using optical, acoustic, or electromagnetic tracking systems, and a tool avatar is projected onto the preoperative models of the patient (either on screen or onto the actual patient). The visualization of the orientation and position of the tool with respect to the patient anatomy enables guidance in the OR. Typically, tools are visualized as simple wireframe or surface models; however, this simple rendering may be problematic due to the difficulty of locating tools at the proper depth, especially in the case of AR. As suggested by Bichlmeier et al. [52], careful consideration should be given to how best to represent surgical tools so that they improve rather than confuse localization and navigation. The use of other prior knowledge data, such as atlases, labels, or reference anatomies, was not mentioned in the systems presented in the selected publications. However, atlas use in IGS is not a novel concept. St Jean et al. [53] developed a visualization platform for planning and guidance of neurosurgical procedures which allowed two- and three-dimensional viewing and manipulation of individual patient MRI data registered to a deformable volumetric atlas of the basal ganglia and thalamus. Nowinski et al. [54] developed an atlas-based neuroimaging system and brain atlas database to support visualization of anatomical structures for real-time navigation. For image-guided liver surgery, Clements et al. [55] proposed the use of a preoperatively computed atlas to deform images of the liver intraoperatively. Derived data, such as uncertainty data, were described in only one of the selected publications: Linte et al. graphically represented uncertainty based on TRE scores using 95 percent confidence ellipsoids. A small number of other works have explored the visual representation of derived data in the form of errors and uncertainty, both in surgical guidance [56] and in 3D visualizations of medical data sets [57], [58].
In the field of computer guided orthopaedic surgery, Simpson et al. [56] visualized registration uncertainty by displaying an uncertainty "cone," or volume, showing the distribution of possible planned linear paths for target localization. The results demonstrated that visualizing uncertainty information for target localization leads to a reduction in the number of targets that are not localized, as well as a decrease in the number of attempts needed to localize a target. In the field of AR systems, Najafi et al. [59] computed the uncertainty of tracking sensors and visualized it as ellipsoids. This uncertainty information allowed them to better estimate the initial pose of objects and improve registration accuracy. Although increasing attention has been given to visually representing uncertainty information, to date this research and interest has mostly been confined to academia. The incorporation of uncertainty information about volume data, or of uncertainty that arises during image processing or guidance, into commercial systems is rare. We are aware of only one commercial system, the Medtronic IGS system, that provides uncertainty visualization, in terms of the Target Registration Error volume, which is visually displayed as isocontours. Further clinical experiments should be done to show whether uncertainty information is beneficial to surgeons in performing their tasks. As there are many sources of uncertainty that may affect navigational guidance, such as segmentation, registration, calibration, and tracking, and as this uncertainty may in turn affect a surgeon's behavior and decision making by changing the decision support information available to them, we believe that such information should play an important role in IGS systems. This is especially true of mixed reality systems, where it is important to quantify and display the quality of alignment between the real and virtual world in order to gain clinical acceptance [60]. Future work should include studies that examine how information about imperfect and incomplete data can be visualized to help surgeons make more informed decisions during intraoperative guidance. The lack of prior knowledge and derived data might suggest that the presented taxonomy does not need to include these types of data to classify systems. However, we believe that the visualization of a surgeon's implicit knowledge, of surgical flow roadmaps, and of non-patient-specific models may be helpful for guidance and therefore needs further study.
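As a concrete illustration of one such representation, the sketch below computes the semi-axes of a 95 percent confidence ellipsoid from a hypothetical 3x3 covariance estimate of target registration error; this is the standard chi-square construction, not the specific method used by Linte et al., Simpson et al., or Najafi et al.

```python
# Semi-axes of a 95% confidence ellipsoid from a 3x3 TRE covariance estimate.
# The covariance values are hypothetical; the construction is the standard
# chi-square scaling of the covariance eigendecomposition.
import numpy as np
from scipy.stats import chi2

cov = np.array([[1.2, 0.1, 0.0],
                [0.1, 0.8, 0.2],
                [0.0, 0.2, 2.0]])   # mm^2, hypothetical

eigvals, eigvecs = np.linalg.eigh(cov)   # principal variances and directions
scale = chi2.ppf(0.95, df=3)             # ~7.81 for 3 degrees of freedom
semi_axes_mm = np.sqrt(scale * eigvals)  # ellipsoid semi-axis lengths

print("semi-axes (mm):", np.round(semi_axes_mm, 2))
print("axis directions (columns):\n", np.round(eigvecs, 2))
```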
6.2 Visualization Processing
In the majority of the papers, simplistic visualization processing was used. The use of simple visualization techniques, such as color coding and transparency, may be problematic because such techniques may not support understanding of the structure and spatial relationships of the data. The use of transparency alone to merge modality data can complicate the perception of relative depth distances and of spatial relationships between surfaces [60]. Further, when stereo viewing is used in applications such as AR, the use of transparency has been shown to make the perception of depth in the stereo images ambiguous [61]. More sophisticated techniques are needed for better understanding of, and interaction with, complex medical and multimodal data sets.
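As a point of reference for what merging with "transparency alone" amounts to, the following is a minimal sketch, with hypothetical image arrays, of blending a virtual overlay into a video frame with one global alpha value; every overlay pixel receives the same weight, which is why relative depth between the layers remains ambiguous.

```python
# Simple global alpha blending of a virtual overlay onto a camera frame.
# Arrays are hypothetical placeholders; real systems would use registered images.
import numpy as np

def blend(frame, overlay, alpha=0.4):
    """Composite overlay onto frame with a single global opacity (no depth cues)."""
    frame = frame.astype(np.float32)
    overlay = overlay.astype(np.float32)
    return np.clip((1.0 - alpha) * frame + alpha * overlay, 0, 255).astype(np.uint8)

frame = np.full((480, 640, 3), 90, dtype=np.uint8)   # stand-in endoscopic frame
overlay = np.zeros_like(frame)
overlay[200:280, 300:380] = (0, 180, 255)            # stand-in rendered structure
fused = blend(frame, overlay, alpha=0.4)
```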
Illustrative techniques [62], [63], [64] have been used and studied in medical image visualization; however, very few IGS systems have incorporated such techniques for use in the OR. Hansen et al. [60] and Stoyanov et al. [64] have looked at using illustrative techniques for medical data sets. In an augmented reality system for minimally invasive coronary bypass surgery, Stoyanov et al. [64] used nonphotorealistic rendering (NPR) to render the virtual object so that it could be perceived at the proper depth. Their motion-compensated visualization used ridge-enhanced NPR to render the embedded virtual object in such a way as not to hinder the visibility of important details on the exposed anatomical surface of the patient. Hansen et al. [60] used distance-encoded silhouettes with varying stroke textures and edge thicknesses to allow better perception of important vessels and easier and more accurate judgement of distances between vessels. It is also possible to enhance depth perception and spatial understanding of medical images by adding depth cues. Many researchers have studied adding perceptual cues (e.g., occlusion [65], stereopsis [66], [67], [68], aerial perspective [67]) to computer-rendered images. They showed that the resulting images can allow the user to better comprehend the structure of the data and the depth at which particular elements lie. Yet many of the results of these works, and newer work on how to specifically enhance the perception of medical images, have not been used in mixed reality visualization systems for IGS. One exception is the work done by the Computer Aided Medical Procedures and Augmented Reality (CAMP AR) group at the Technische Universität München (TUM). A large portion of the research at this center has focused on how to improve the depth perception of medical images in medical augmented reality [69], [70], [52], [71]. In particular, they have studied the use of texture to introduce the cue of motion parallax [69], [70], the use of an opaque window to introduce the cue of occlusion [69], [70], the use of virtual shadows and shading [52], and the use of glass-like highlights [71] for shape and depth perception. Their work has shown that improved visualization and interaction techniques can improve the perception of the relative depth of virtual objects, the layout of objects in an AR scene, and the interaction with and view of objects not directly visible from the user's point of view. Continued research efforts such as those of CAMP AR, which focus on and take advantage of knowledge of the human visual system, are needed to determine how best to display medical images to the end user.
6.3 View: Displays and Interaction
In the 17 selected publications, typically either an HMD (or a variation thereof) or a monitor was used. Autostereoscopic displays were not among the solutions presented. However, a number of such displays have been studied for IGS, including multiview lenticular displays [68], integral videography [72], and holography [73]. For craniofacial procedures, Ko [73] studied projecting 3D holographic images onto the surgical site to serve as a visual template through which surgeons looked at the patient while performing surgery. Liao et al. [72] developed a high-resolution autostereoscopic display to present 3D MRI images for use in surgical planning and intraoperative guidance. It is
possible that, as the cost of autostereoscopic displays falls, the use of these technologies will become more popular. In the selected publications, semitransparent or half-silvered mirrors were not used, although they have been developed for use in AR. Blackwell et al. [74] used a computer-generated image that was reflected through a semitransparent or half-silvered mirror so that the generated image appeared to be projected onto the patient. The mirror was used in combination with tracked 3D shutter glasses so that the virtual objects could be seen in the correct position and in stereo. Stetten and Chib [50] developed both a traditional mirror overlay system and a portable system in which a mirror and a miniature flat-panel display were attached to an ultrasound probe. Their system allowed image overlay onto the patient without tracking, and arbitrary slice views owing to the free movement of the ultrasound probe. As can be seen from the variety of displays used in IGS, deciding on a display technology is not trivial. A number of studies have shown that there is no single intraoperative display technology that suits all situations and that choosing the appropriate technology is highly application, task, and user dependent [75], [68]. In terms of the interaction component, only half of the systems in the selected publications described virtual interaction tools, and only two presented novel hardware solutions: Simitopoulos and Kosaka [29] used a MicroScribe device, and Mischkowski et al. presented the X-Scope portable LCD screen [33]. As noted in Bichlmeier et al.'s most recent work on "the virtual mirror" [76], [77], classical hardware interaction paradigms, such as the mouse, are not appropriate for handling the interaction between the surgeon and the visually processed data in AR. One work that looked for a novel solution to replace the use of a touchscreen or mouse comes from Fischer et al. [78]. In their work, users perform gestures with surgical tools that are already present and tracked in the operating room, and these gestures are recognized by the tracking system. The user may click on simple menu markers to load patient data, choose to draw points or lines, and change the color of virtual objects. The user can also use the surgical tool to draw plans directly on the patient. The absence of AR in today's ORs may be due in part to the inability of surgeons to view a region of interest from all desired perspectives [76]. Yet there has been little focus on modernizing and developing new solutions for interacting with visually processed data. Two future avenues for research are possible: 1) developing appropriate hardware devices that can be used to interact with visually processed data in order to easily and efficiently adjust viewing and visualization parameters, and 2) creating interfaces that require little to no interaction. While interaction with a system is generally necessary, in the domain of surgical interventions it may interfere with and interrupt the surgical workflow. Perhaps, in an ideal IGS system, the view would change without interaction and without disturbing the surgical workflow: a suitable representation of the appropriate data would be presented at any given stage of the surgery for a given task.
6.4 Surgical Scenario
Although we do not specifically classify systems based on the type of surgery, the notion of the surgical scenario allows us to consider which data should be presented, and how, at a particular moment during the surgery, based on the steps required to complete it. New research efforts have focused on presenting only the necessary data and information at a particular surgical step. This can be done by monitoring all signals and actions performed in the OR [77], [79], by determining a model of surgery which classifies which imaging data are required at each step of the surgery [23], or by classifying microscope images to automatically recognize the current surgical phase [80].
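A minimal sketch of this idea, assuming a surgical phase label is already available from one of the recognition approaches cited above, is a simple lookup from phase to the visually processed data to present; the phase names and data lists are hypothetical and do not come from any of the cited systems.

```python
# Hypothetical phase-to-view lookup: once the current surgical phase is known
# (e.g., from workflow monitoring or microscope image classification), select
# which visually processed data to present, with no explicit user interaction.
PHASE_VIEWS = {
    "skin incision": ["planned incision line"],
    "craniotomy": ["bone surface", "planned opening contour"],
    "tumor resection": ["tumor surface", "vessels", "registration uncertainty"],
    "closing": [],
}

def data_for_phase(phase: str) -> list:
    """Return the data objects to display for the recognized phase."""
    return PHASE_VIEWS.get(phase, [])

print(data_for_phase("tumor resection"))
```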
7 CONCLUSIONS
Surgical navigation and guidance are inherently four-dimensional tasks. Surgeons must be able to visualize complex 3D anatomical structures and understand their interrelationships, as well as their orientation and localization in space, over the time period of the surgical procedure. Mixed reality visualization has been proposed in IGS to help with 3D perception and to overcome the surgeon's limited visual field of view. In this work, we have described the DVV taxonomy based on Data type, Visualization processing, and View. Using these three factors, we can describe a system based on what type of data should be visualized, how it should be visualized, at what point in the surgery it should be visualized, and how the user can interact with the data, both in terms of manipulation on screen and hardware devices for interaction. By defining such a taxonomy, we believe we provide both developers and users with a tool to build and study mixed reality IGS systems. The terminology and proposed classes depicted in our diagrams can be implemented to form the backbone of a mixed reality visualization system. The DVV taxonomy facilitated a complete analysis and comparison of 17 state-of-the-art mixed reality IGS systems. The analysis of the publications brought to light particular patterns, for example, the lack of use of derived and prior knowledge data and the limited attention paid in IGS to visualization processing and the interaction component of the view. In doing so, the DVV taxonomy was shown to be useful for finding gaps in current research and suggesting avenues of future study in the field. Our examination showed that few of the systems' components were validated and even fewer systems were evaluated. Validations were typically numerical, and were done only to test registration accuracy or the accuracy of the system as a whole. It is not obvious whether new visualization systems, which provide alternative and novel views of existing information, have added value, given that they do not provide new information. Therefore, evaluation of new visualization systems is crucial, and the lack of evaluations demonstrating the clinical need for these systems may explain their absence from the OR. The view component, and in particular the interaction tools, were neither validated nor evaluated in any of the selected publications, and few papers considered the
visualization processing of the data and how the data could best be represented on screen during the surgery. Unless these components are evaluated and validated to show that there is an improvement in patient outcomes, surgery times, etc., there is no need or motivation to transfer the knowledge to commercial systems. The notion of the surgical scenario gives us the ability to take into account the discrete steps of the surgical procedure and helps us to indirectly evaluate a system by examining how the visually processed data and the possible interaction at each surgical step affect the completion time and the precision with which each step is accomplished. The DVV taxonomy was found useful for describing the factors which should be considered in developing a system and which must be validated and evaluated before a system is accepted for use in the OR. The DVV taxonomy was shown to fit the IGS literature well, and it was found to be complete. We believe that the presented taxonomy is also consistent and concise, in that knowledge in the domain of mixed reality systems for IGS has been correctly identified. The presented taxonomy is also expandable: in particular, with the development of new display technologies, interaction paradigms, and visualization processing techniques, new values for the components and classes of the taxonomy will emerge. The DVV taxonomy differs from other visualization taxonomies in its specificity to mixed reality visualization in IGS and in its ability to account for the constraints of the OR and of the user. The DVV taxonomy was shown to be useful in defining a common language for researchers and developers to use in discussing mixed reality IGS systems, and it therefore serves as a guide to structure the work that has been done in this domain. It is useful for comparing, analyzing, and evaluating systems in a consistent manner and therefore may help bring us a step closer to ensuring the successful introduction of more mixed reality IGS systems into the OR.
ACKNOWLEDGMENTS
The authors would like to thank Perrine Paul for initial discussions about this work and for providing valuable insights into how to classify mixed reality visualization. This work was supported by grants from the Canadian Institutes of Health Research (CIHR MOP-74725) and the Natural Sciences and Engineering Research Council of Canada (NSERC 238739-06 and NSERC CGS).
REFERENCES
[1] K. Cleary and T.M. Peters, "Image-Guided Interventions: Technology Review and Clinical Applications," Ann. Rev. of Biomedical Eng., vol. 12, pp. 119-142, 2010.
[2] R. Ewers, K. Schicho, A. Wagner, G. Undt, R. Seemann, M. Figl, and M. Truppe, "Seven Years of Clinical Experience with Teleconsultation in Craniomaxillofacial Surgery," J. Oral and Maxillofacial Surgery, vol. 63, pp. 1447-1454, Oct. 2005.
[3] C. Bichlmeier, B. Ockert, S.M. Heining, A. Ahmadi, and N. Navab, "Stepping into the Operating Theater: Arav—Augmented Reality Aided Vertebroplasty," Proc. Seventh IEEE/ACM Int'l Symp. Mixed and Augmented Reality (ISMAR '08), pp. 165-166, 2008.
[4] M. Kersten-Oertel, P. Jannin, and D. Collins, "DVV: Towards a Taxonomy for Mixed Reality Visualization in Image Guided Surgery," Proc. Fifth Int'l Conf. Medical Imaging and Augmented Reality, H. Liao, P. Edwards, X. Pan, Y. Fan, and G.-Z. Yang, eds., vol. 6326, pp. 334-343, 2010.
[5] P. Milgram and H. Colquhoun, "A Taxonomy of Real and Virtual World Display Integration," Merging Real and Virtual Worlds, Springer Verlag, 1999.
[6] M. Das, F. Sauer, U. Schoepf, A. Khamene, S. Vogt, S. Schaller, R. Kikinis, E. vanSonnenberg, and S. Silverman, "Augmented Reality Visualization for CT-Guided Interventions: System Description, Feasibility, and Initial Evaluation in an Abdominal Phantom," Radiology, vol. 240, no. 1, pp. 230-235, 2006.
[7] R. Azuma, "A Survey of Augmented Reality," Presence: Teleoperators and Virtual Environments, vol. 6, no. 4, pp. 355-385, 1997.
[8] J.H. Shuhaiber, "Augmented Reality in Surgery," Archives of Surgery, vol. 139, pp. 170-174, 2004.
[9] P. Paul, O. Fleig, and P. Jannin, "Augmented Virtuality Based on Stereoscopic Reconstruction in Multimodal Image-Guided Neurosurgery: Methods and Performance Evaluation," IEEE Trans. Medical Imaging, vol. 24, no. 11, pp. 1500-1511, Nov. 2005.
[10] M.C. Chuah and S.F. Roth, "On the Semantics of Interactive Visualizations," Proc. IEEE Symp. Information Visualization (INFOVIS), pp. 29-36, 1996.
[11] M.X. Zhou and S.K. Feiner, "Visual Task Characterization for Automated Visual Discourse Synthesis," Proc. SIGCHI Conf. Human Factors in Computing Systems (CHI '98), pp. 392-399, 1998.
[12] A. Buja, D. Cook, and D.F. Swayne, "Interactive High-Dimensional Data Visualization," J. Computational and Graphical Statistics, vol. 5, no. 1, pp. 78-99, 1996.
[13] M. Tory and T. Moller, "Rethinking Visualization: A High-Level Taxonomy," Proc. IEEE Symp. Information Visualization (INFOVIS '04), pp. 151-158, 2004.
[14] S. Wehrend and C. Lewis, "A Problem-Oriented Classification of Visualization Techniques," Proc. First IEEE Conf. Visualization (VIS '90), pp. 139-143, 1990.
[15] B. Shneiderman, "The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations," Proc. IEEE Symp. Visual Languages, pp. 336-343, 1996.
[16] J. Bertin, Semiology of Graphics. Univ. of Wisconsin Press, 1983.
[17] S. Card and J. Mackinlay, "The Structure of the Information Visualization Design Space," Proc. IEEE Symp. Information Visualization (INFOVIS '97), pp. 92-99, Oct. 1997.
[18] D.A. Keim, "Visual Exploration of Large Data Sets," Comm. ACM, vol. 44, no. 8, pp. 38-44, 2001.
[19] E. Dubois, L. Nigay, J. Troccaz, O. Chavanon, and L. Carrat, "Classification Space for Augmented Surgery, an Augmented Reality Case Study," Proc. INTERACT '99, pp. 353-359, 1999.
[20] E. Dubois, P. Gray, and L. Nigay, "Asur++: Supporting the Design of Mobile Mixed Systems," Interacting with Computers, vol. 14, pp. 497-520, 2003.
[21] E. Dubois and P. Gray, "A Design-Oriented Information-Flow Refinement of the Asur Interaction Model," Proc. Conf. Eng. for Human Computer Interaction (EHCI '07), 2007.
[22] E.H. Chi, "A Taxonomy of Visualization Techniques Using the Data State Reference Model," Proc. IEEE Symp. Information Visualization (INFOVIS '00), pp. 69-75, 2000.
[23] P. Jannin and X. Morandi, "Surgical Models for Computer-Assisted Neurosurgery," NeuroImage, vol. 37, no. 3, pp. 738-791, 2007.
[24] T. Neumuth, P. Jannin, G. Strauss, J. Meixensberger, and O. Burgert, "Validation of Knowledge Acquisition for Surgical Process Models," J. Am. Medical Informatics Assoc., vol. 16, pp. 72-80, Jan./Feb. 2009.
[25] J. Geng, "Volumetric 3D Display for Radiation Therapy Planning," J. Display Technology, vol. 4, no. 4, pp. 437-450, Dec. 2008.
[26] L.A. Kahrs, H. Hoppe, G. Eggers, J. Raczkowsky, R. Marmulla, and H. Wörn, "Stereoscopic Augmented Reality for Operating Microscopes," Medicine Meets Virtual Reality 13: The Magical Next Becomes the Medical Now, vol. 111, pp. 243-246, 2005.
[27] M. Aschke, C. Wirtz, J. Raczkowsky, H. Wörn, and S. Kunze, "Stereoscopic Augmented Reality for Operating Microscopes," Proc. Seventh Int'l Congress and Exhibition on Computer Assisted Radiology and Surgery, vol. 1256, pp. 408-413, 2003.
[28] C.A. Linte, J. Moore, A. Wiles, J. Lo, C. Wedlake, and T.M. Peters, "In Vitro Cardiac Catheter Navigation Via Augmented Reality Surgical Guidance," Proc. Soc. of Photo-Optical Instrumentation Engineers (SPIE) Conf. Series, Feb. 2009.
[29] D. Simitopoulos and A. Kosaka, “An Augmented Reality System for Surgical Navigation,” Proc. Int’l Conf. Augmented, Virtual Environments and 3D Imaging (ICAV3D ’01), pp. 152-156, May 2001. [30] S. Vogt, A. Khamene, and F. Sauer, “Reality Augmentation for Medical Procedures: System Architecture, Single Camera Marker Tracking, and System Evaluation,” Int’l J. Computer Vision, vol. 70, pp. 179-190, http://dx.doi.org/10.1007/s11263-006-7938-1, 2006. [31] N. Glossop, Z. Wang, C. Wedlake, J. Moore, and T. Peters, “Augmented Reality Laser Projection Device for Surgery,” Proc. Medicine Meets Virtual Reality (MMVR) Conf., J.W. et al., eds., pp. 104-110, 2004. [32] K. Konishi, M. Hashizume, M. Nakamoto, Y. Kakeji, I. Yoshino, A. Taketomi, Y. Sato, S. Tamura, and Y. Maehara, “Augmented Reality Navigation System for Endoscopic Surgery Based on Three-Dimensional Ultrasound and Computed Tomography: Application to 20 Clinical Cases,” Int’l Congress Series, cARS 2005: Computer Assisted Radiology and Surgery. http:// www.sciencedirect.com/science/article/B7581-4GFTP32-3N/2/ fb6fb4f ea98d3ad4b823e59d3ae4b355, vol. 1281, pp. 537-542, 2005. [33] R.A. Mischkowski, M.J. Zinser, A.C. Kbler, B. Krug, U. Seifert, and J.E. Zller, “Application of an Augmented Reality Tool for Maxillary Positioning in Orthognathic Surgery—A Feasibility Study,” J. Cranio-Maxillo-Facial Surgery : Official Publication of the European Assoc. for Cranio-Maxillo-Facial Surgery, vol. 34, pp. 478483, Dec. 2006. [34] M. Scheuering, A. Schenk, A. Schneider, B. Preim, and G. Greiner, “Intraoperative Augmented Reality for Minimally Invasive Liver Interventions,” Proc. SPIE Medical Imaging, 2002. [35] N. Suzuki, A. Hattori, K. Tanoue, S. Ieiri, K. Konishi, M. Tomikawa, H. Kenmotsu, and M. Hashizume, “Scorpion Shaped Endoscopic Surgical Robot for Notes and SPS with Augmented Reality Functions,” Proc. Fifth Int’l Conf. Medical Imaging and Augmented Reality, H. Liao, P. Edwards, X. Pan, Y. Fan, and G.-Z. Yang, eds., vol. 6326, pp. 541-550, 2010. [36] M. Wieczorek, A. Aichert, O. Kutter, C. Bichlmeier, J. L, R.M. Heining, E. Euler, N. Navab, and T.U. Mnchen, “GPU-Accelerated Rendering for Medical Augmented Reality in Minimally-Invasive Procedures,” Proc. Bildverarbeitung fuer die Medizin (BVM), 2010. [37] M. Baumhauer, J. Neuhaus, K. Fritzsche, and H.-P. Meinzer, “The MITK Image Guided Therapy Toolkit and Its Application for Augmented Reality in Laparoscopic Prostate Surgery,” Medical Imaging 2010: Visualization, Image-Guided Procedures, and Modeling, K.H. Wong and M.I. Miga, eds., vol. 7625, 2010. [38] T. Kawamata, H. Iseki, T. Shibasaki, and T. Hori, “Endoscopic Augmented Reality Navigation System for Endonasal Transsphenoidal Surgery to Treat Pituitary Tumors: Technical Note,” Neurosurgery, vol. 50, pp. 1393-1397, June 2002. [39] B.W. King, L.A. Reisner, M.D. Klein, G.W. Auner, and A.K. Pandya, “Registered, Sensor-Integrated Virtual Reality for Surgical Applications,” Proc. IEEE Virtual Reality Conf. (VR ’07), pp. 277278, 2007. [40] L. Soler, S. Nicolau, J. Schmid, C. Koehl, J. Marescaux, X. Pennec, and N. Ayache, “Virtual Reality Augmented Reality in Digestive Surgery,” Proc. Third IEEE/ACM Int’l Symp. Mixed and Augmented Reality (ISMAR ’04), pp. 278-279, 2004. [41] W. Birkfellner, M. Figl, C. Matula, J. Hummel, R. Hanel, H. Imhof, F. Wanschitz, A. Wagner, F. Watzinger, and H. Bergmann, “Computer-Enhanced Stereoscopic Vision in a Head-Mounted Operating Binocular,” Physics in Medicine and Biology, vol. 48, no. 3, pp. 49-57, 2003. [42] A.P. 
King, P.J. Edwards, C.R. Maurer, D.A.d. Cunha, R.P. Gaston, M. Clarkson, D.L.G. Hill, D.J. Hawkes, M.R. Fenlon, A.J. Strong, T.C.S. Cox, and M.J. Gleeson, “Stereo Augmented Reality in the Surgical Microscope,” Presence: Teleoperator Virtual Environment, vol. 9, no. 4, pp. 360-368, 2000. [43] J.P. Rolland and H. Fuchs, “Optical versus Video See-Through Head-Mounted Displays in Medical Visualization,” Presence: Teleoperator Virtual Environment, vol. 9, no. 3, pp. 287-309, 2000. [44] J. Luber and A. Mackevics, “Multiple co-Ordinate Manipulator (MKM): A Computer-Assisted Microscope,” Proc. Symp. Computer Assisted Radiology, pp. 1121-1125, 1995. [45] P. Jannin, X. Morandi, O. Fleig, E. Le Rumeur, P. Toulouse, B. Gibaud, and J.-M. Scarabin, “Integration of Sulcal and Functional Information for Multimodal Neuronavigation,” J. Neurosurgery, vol. 96, no. 4, pp. 713-723, 2002.
[46] C.A. Linte, J. Moore, A.D. Wiles, C. Wedlake, and T.M. Peters, “Virtual Reality-Enhanced Ultrasound Guidance: A Novel Technique for Intracardiac Interventions,” Computer Aided Surgery, vol. 13, pp. 82-94, Mar. 2008. [47] A. Go`mez-Prez, “Evaluation of Ontologies,” Int’l J. Intelligent Systems, vol. 16, no. 3, pp. 391-409, 2001. [48] C. Brewster, H. Alani, S. Dasmahapatra, and Y. Wilks, “Data Driven Ontology Evaluation,” Proc. Int’l Conf. Language Resources and Evaluation (LREC ’04), 2004. [49] C. Welty and N. Guarino, “Supporting Ontological Analysis of Taxonomic Relationships,” Data and Knowledge Eng., vol. 39, pp. 51-74, 2001. [50] G. Stetten and V. Chib, “Overlaying Ultrasonographic Images on Direct Vision,” J. Ultrasound in Medicine, vol. 20, no. 3, pp. 235-240, 2001. [51] A. Enquobahrie, P. Cheng, K. Gary, L. Ibanez, D. Gobbi, F. Lindseth, Z. Yaniv, S. Aylward, J. Jomier, and K. Cleary, “The Image-Guided Surgery Toolkit Igstk: An Open Source c++ Software Toolkit,” J. Digital Imaging, vol. 20, pp. 21-33, Nov. 2007. [52] C. Bichlmeier, F. Wimmer, S.M. Heining, and N. Navab, “Contextual Anatomic Mimesis Hybrid in-Situ Visualization Method for Improving Multi-Sensory Depth Perception in Medical Augmented Reality,” Proc. Sixth IEEE and ACM Int’l Symp. Mixed and Augmented Reality (ISMAR ’07), pp. 1-10, 2007. [53] P. St Jean, A. Sadikot, D. Collins, D. Clonda, R. Kasrai, A. Evans, and T. Peters, “Automated Atlas Integration: Three-Dimensional Visualization Tools for Planning and Guidance in Functional Neurosurgery,” IEEE Trans. Medical Imaging, vol. 17, no. 5, pp. 672-680, Oct. 1998. [54] W. Nowinski, A. Fang, B. Nguyen, J. Raphel, L. Jagannathan, R. Raghavan, R. Bryan, and G. Miller, “Multiple Brain Atlas Database and Atlas-Based Neuroimaging System,” Computer Aided Surgery, vol. 2, pp. 42-66, 1997. [55] L.W. Clements, P. Dumpuri, W.C. Chapman, R.L. Galloway, and M.I. Miga, “Atlas-Based Method for Model Updating in ImageGuided Liver Surgery,” Medical Imaging 2007: Proc. Conf. Visualization and Image-Guided Procedures, K.R. Cleary and M.I. Miga, eds., vol. 6509, no. 1, p. 650917, 2007. [56] A. Simpson, B. Ma, E. Chen, R. Ellis, and J. Stewart, “Using Registration Uncertainty Visualization in a User Study of a Simple Surgical Task,” Proc. Int’l Conf. Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 397-404, 2006. [57] C. Lundstro¨m, P. Ljung, A. Persson, and A. Ynnerman, “Uncertainty Visualization in Medical Volume Rendering Using Probabilistic Animation,” IEEE Trans. Visualization and Computer Graphics, vol. 13, no. 6, pp. 1648-1655, Nov./Dec. 2007. [58] J. Kniss, R. Van Uitert, A. Stephens, G.-S. Li, T. Tasdizen, and C. Hansen, “Statistically Quantitative Volume Visualization,” Proc. IEEE Visualization, pp. 287-294, Oct. 2005. [59] H. Najafi, N. Navab, and G. Klinker, “Automated Initialization for Marker-Less Tracking: A Sensor Fusion Approach,” Proc. Int’l Symp. Mixed and Augmented Reality, Nov. 2004. [60] C. Hansen, J. Wieferich, F. Ritter, C. Rieder, and H.-O. Peitgen, “Illustrative Visualization of 3D Planning Models for Augmented Reality in Liver Surgery,” Int’l J. Computer Assisted Radiology and Surgery, vol. 5, no. 2, pp. 122-141, 2010. [61] L. Johnson, P. Edwards, and D. Hawkes, “Surface Transparency Makes Stereo Overlays Unpredictable: The Implications for Augmented Reality,” Studies in Health Technology and Informatics, vol. 94, pp. 131-136, 2002. [62] C. Tietjen, T. Isenberg, and B. 
Preim, “Combining Silhouettes, Surface, and Volume Rendering for Surgery Education and Planning,” Proc. Eurographics/IEEE VGTC Symp. Visualization (EUROVIS ’05), 2005. [63] L. Neumann, M. Sbert, B. Gooch, and W. Purgathofer, “Illustrative Visualization for Medical Training,” 2008. [64] D. Stoyanov, G. Mylonas, M. Lerotic, A. Chung, and G.-Z. Yang, “Intra-Operative Visualizations: Perceptual Fidelity and Human Factors,” J. Display Technology, vol. 4, no. 4, pp. 491-501, Dec. 2008. [65] M. Bajura, H. Fuchs, and R. Ohbuchi, “Merging Virtual Objects with the Real World: Seeing Ultrasound Imagery within the Patient,” Proc. ACM SIGGRAPH ’92, pp. 203-210, 1992. [66] J. Inoue, M. Kersten, B. Ma, J. Stewart, J. Rudan, and R. Ellis, “Fast Assessment of Acetabular Coverage Using Stereoscopic Volume Rendering,” Proc. Medicine Meets Virtual Reality (MMVR) Conf., 2006.
[67] M. Kersten, J. Stewart, N. Troje, and R. Ellis, “Enhancing Depth Perception in Translucent Volumes,” IEEE Trans. Visualization Computer Graphics, vol. 12, no. 5, pp. 1117-1124, Sept./Oct. 2006. [68] J.R. Cooperstock and G. Wang, “Stereoscopic Display Technologies, Interaction Paradigms, and Rendering Approaches for Neurosurgical Visualization,” Stereoscopic Displays and Applications XX, vol. 7237, no. 1, p. 723703, 2009. [69] C. Bichlmeier and N. Navab, “Virtual Window for Improved Depth Perception in Medical AR,” Proc. Int’l Workshop Augmented Reality Environments for Medical Imaging and Computer-Aided Surgery (AMI-ARCS ’06), 2006. [70] C. Bichlmeier, T. Sielhorst, S.M. Heining, and N. Navab, “Improving Depth Perception in Medical AR: A Virtual Vision Panel to the Inside of the Patient,” Proc. Bildverarbeitung fuer die Medizin (BVM ’07), Mar. 2007. [71] T. Sielhorst, M. Feuerstein, J. Traub, O. Kutter, and N. Navab, “CAMPAR: A Software Framework Guaranteeing Quality for Medical Augmented Reality,” Int’l J. Computer Assisted Radiology and Surgery, vol. 1, no. Supplement 1, pp. 29-30, June 2006. [72] H. Liao, N. Hata, S. Nakajima, M. Iwahara, I. Sakuma, and T. Dohi, “Surgical Navigation by Autostereoscopic Image Overlay of Integral Videography,” IEEE Trans. Information Technology in Biomedicine, vol. 8, no. 2, pp. 114-21, June 2004. [73] K. Ko, “Superimposed Holographic Image-Guided Neurosurgery,” J. Neurosurgery, vol. 88, no. 4, pp. 777-781, 1998. [74] M. Blackwell, C. Nikou, A.M. DiGioia, and T. Kanade, “An Image Overlay System for Medical Data Visualization,” Medical Image Analysis, vol. 4, pp. 67-72, 2000. [75] J. Traub, T. Sielhorst, S.-M. Heining, and N. Navab, “Advanced Display and Visualization Concepts for Image Guided Surgery,” J. Display Technology, vol. 4, no. 4, pp. 483-490, Dec. 2008. [76] C. Bichlmeier, S.M. Heining, M. Feuerstein, and N. Navab, “The Virtual Mirror: A New Interaction Paradigm for Augmented Reality Environments,” IEEE Trans. Medicine Imaging, vol. 28, no. 9, pp. 1498-1510, Sept. 2009. [77] N. Navab, J. Traub, T. Sielhorst, M. Feuerstein, and C. Bichlmeier, “Action- and Workflow-Driven Augmented Reality for Computer-Aided Medical Procedures,” IEEE Computer Graphics and Applications, vol. 27, no. 5, pp. 10-14, Sept./Oct. 2007. [78] J. Fischer, D. Bartz, and W. Straßer, “Intuitive and Lightweight User Interaction for Medical Augmented Reality,” Proc. Vision, Modeling and Visualization (VMV), pp. 375-382, Nov. 2005. [79] N. Padoy, T. Blum, I. Essa, H. Feussner, M. Berger, and N. Navab, “A Boosted Segmentation Method for Surgical Workflow Analysis,” Proc. 10th Int’l Conf. Medical Image Computing and ComputerAssisted Intervention (MICCAI ’07), pp. 102-109, 2007. [80] F. Lalys, L. Riffaud, X. Morandi, and P. Jannin, “Automatic Phases Recognition in Pituitary Surgeries by Microscope Images Classification,” Proc. First Int’l Conf. Information Processing in ComputerAssisted Interventions, N. Navab and P. Jannin, eds., vol. 6135, pp. 34-44, 2010. Marta Kersten-Oertel received the BSc (Honours) degree in computer science and the BA degree in art history from Queen’s University (Kingston) in 2002. In 2005, she completed the MSc degree in computer science at Queen’s University. After working as a research assistant at the GRaphisch-Interaktive Systeme at the University of Tu¨bingen (Germany) she began the PhD degree in 2008 at the Biomedical Engineering at McGill University (Montreal). 
She is a member of the Image Processing Lab (IPL) at the Montreal Neurological Institute (MNI). Her research is focused on developing and evaluating new visualization techniques for image-guided neurosurgery.
Pierre Jannin received the PhD degree from the University of Rennes in 1988 on multimodal 3D imaging in neurosurgery and the Habilitation de Recherche (HDR) from the University of Rennes in 2005 on information- and knowledge-assisted neurosurgery. He is a senior INSERM researcher in the Medical School of the University of Rennes (France). He has over 20 years of experience in designing and developing image guided surgery systems for neurosurgery. His research topics include image-guided surgery, multimodal imaging data fusion, augmented reality, modeling of surgical procedures, and validation in medical image processing. The main clinical application areas concern functional neurosurgery and surgery of low-grade tumors in central areas.
D. Louis Collins received the PhD degree in biomedical engineering from McGill University in 1994. He is a scientist in the McConnell Brain Imaging Centre at the Montreal Neurological Institute (MNI) and a professor in the Departments of Neurology and Neurosurgery, and Biomedical Engineering at McGill University (Montreal). His research interests include automated anatomical segmentation and atlasing in a neuroanatomical context for the analysis of normal aging, the effect of neurological diseases on the brain as well as modeling and visualization for planning and guidance of neurosurgery.