Vishnoo - An Open-Source Software for Vision Research - IEEE Xplore

Vishnoo - An Open-Source Software for Vision Research

Enkelejda Tafaj, Thomas C. Kübler, Jörg Peter, Wolfgang Rosenstiel
Wilhelm-Schickard Institute for Computer Science, University of Tübingen, Germany
[email protected]

Martin Bogdan
Computer Engineering, University of Leipzig, Germany

Ulrich Schiefer
Institute for Ophthalmic Research, University of Tübingen, Germany

Abstract

The visual input is perhaps the most important sensory information. Understanding its mechanisms as well as the way visual attention arises could be highly beneficial for many tasks involving the analysis of users' interaction with their environment. We present Vishnoo (Visual Search Examination Tool), an integrated framework that combines configurable search tasks with gaze tracking capabilities, thus enabling the analysis of both the visual field and visual attention. Our user studies underpin the viability of such a platform. Vishnoo is an open-source software and is available for download at http://www.vishnoo.de/

1. Introduction

Understanding the visual sensory and attentive mechanisms has been the scope of much research in medicine, psychology and engineering. Research on the visual field, visual function and attention has benefited especially from the development of eye-tracking algorithms and devices. The quantification of eye movements has led not only to new diagnostic methods but also to a better understanding of the visual function. Scientific work on vision and visual search is based on the design and configuration of a specific psychophysical task and the analysis of the user's behavior. A specific task could be, for example, the localization of a randomly positioned target among distractors, representing and simplifying typical visual search activities in everyday life, e.g. finding a specific item on a supermarket shelf. Such tasks are often designed only once for a specific scientific study without being reused within the research community. This is mainly due to the lack of a platform that integrates visual search tasks with eye-tracking facilities.

Although both commercial eye trackers, e.g. Dikablis [7], Interactive Minds Eye-Tracker [9], EyeGaze [11], SMI [21] or Tobii [23], and open-source solutions, e.g. the ITU Gaze Tracker [19], provide powerful algorithms for gaze tracking, synchronization of scan paths with stimulus events, and analysis and visualization of visual scan paths, they still do not provide user-friendly interfaces for stimulus generation or task programming. On the other hand, there are several tools for stimulus presentation in experimental studies in vision research, neuroscience and psychology, e.g. the Psychtoolbox for Matlab (The MathWorks Inc., Massachusetts, USA) [1], PsychoPy [17], SuperLab [4], Presentation [14] or E-Prime [18]. These products offer stimulus delivery and experimental control, but they lack integration of eye tracking and scan path analysis.

To combine freely configurable search tasks with gaze tracking and visual scan path analysis, we developed Vishnoo (Visual Search Examination Tool). Vishnoo provides mobile campimetry to assess the visual field, four built-in visual search tasks and robust algorithms for eye tracking and visual scan path analysis. The built-in tasks include a comparative search task, MAFOV as a pop-out task to examine bottom-up visual processing, a conjunctive search task and a video plugin for the analysis of the visual processing of (natural-scene) images and videos. Recorded scan paths can be analyzed efficiently. The architecture of the software platform is the focus of the next section. Section 3 presents how the visual field can be measured using Vishnoo. The search tasks provided by our framework are discussed in Section 4. Section 5 deals with aspects of eye tracking and visual scan path analysis implemented in Vishnoo. Section 6 concludes this paper.

2. Vishnoo architecture

Vishnoo is designed with respect to modularity, scalability, flexibility and adaptivity to new stimuli or search tasks. Thus, the underlying platform architecture is plugin-based and highly modular, as depicted schematically in Figure 1. Each module represents a part of the examination workflow.

Configuration: Vishnoo is designed not only for use in scientific studies but also for clinical use, so we provide several configuration options. In research studies it might, for example, be useful to change the psychophysical stimuli (e.g. the presentation time), while in clinical use a standardized, examination-like task (e.g. in campimetry, Section 3) may be of more interest. Stimulus representations as well as input and output devices, e.g. touchscreens or eye-tracking devices, can be loaded as plugins at runtime. This is possible due to encapsulated and well-documented interfaces.

Evaluator module: The evaluator module is task-specific and provides a variety of useful algorithms and functions for visualizing the results of a search task. The standard export format is XML; however, a plugin interface allows conversion to any other file format.

Vishnoo plugins: Thanks to well-defined interfaces, new features (e.g. support for other devices) and existing or third-party modules for input, visualization and export can be integrated easily. Plugins are linked dynamically at runtime. Custom input devices or data sources can be used by Vishnoo via the input plugin interface; one example is the eye-tracking plugin.

Figure 1. Schematic view of Vishnoo's software architecture

Patient information module: This module manages the subject's information, e.g. name, ID, date of birth etc. (in case of anonymous handling a subject is identified only by an ID), information about the examiner, the examination date and other settings. This data is integrated into the examination result.

Examination module: An examination is either the assessment of the visual field, in the case of mobile campimetry, or one of the four search tasks. It consists of a task model and a set of corresponding configuration values. The task model contains queued individual tasks, which are processed sequentially or in parallel. A task may either be a single simple action, like displaying a stimulus, or a concatenation of subtasks. For example, the task of presenting a stimulus may consist of displaying the stimulus, waiting some time, removing the stimulus and waiting for user feedback (e.g. a button press). When implementing new examination types, tasks from existing examinations can be reused, which accelerates the design and development of new tasks, avoids redundant code and supports good overall software maintainability.
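The task model described above, a task being either a single action or a concatenation of subtasks, is essentially the composite pattern. The following is a minimal sketch under that assumption; the class and action names are ours, not Vishnoo's.

```python
# Illustrative sketch of the described task model: a task is either a
# single action or a sequence of subtasks, processed in order.
# All names are hypothetical, not Vishnoo's API.

class Task:
    def run(self, log):
        raise NotImplementedError

class Action(Task):
    """A single simple action, e.g. displaying a stimulus."""
    def __init__(self, name):
        self.name = name

    def run(self, log):
        log.append(self.name)  # a real action would drive the display here

class Sequence(Task):
    """A concatenation of subtasks, executed sequentially."""
    def __init__(self, *subtasks):
        self.subtasks = subtasks

    def run(self, log):
        for t in self.subtasks:
            t.run(log)

# "Present a stimulus" assembled from reusable subtasks, as in the text:
present = Sequence(Action("display stimulus"), Action("wait"),
                   Action("remove stimulus"), Action("await user feedback"))
log = []
present.run(log)
```

Because a Sequence is itself a Task, whole examinations can be built by nesting and reusing existing task definitions, which is the reuse property the text emphasizes.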

A stimuli plugin delivered with Vishnoo includes various stimuli of simple geometric shapes like squares, circles, triangles, Landolt rings and annuli with variable sizes and colors. The presentation of images or videos is managed by a video plugin. Thanks to OpenGL acceleration, these resources are displayed smoothly even when a large number of stimuli is presented at the same time. Further visualizations can be added either using Nokia's high-level Qt library or directly via OpenGL.

Exporter module: An export plugin enables the customization of export data formats to meet the individual requirements of Vishnoo users, e.g. when linking Vishnoo results with other, existing frameworks.

3. Mobile campimetry

Campimetry is the examination of the visual field on a flat surface (screen). The visual field represents the area that can be perceived when the eye is directed straight ahead. As diseases affecting the visual system result in visual field defects, the systematic measurement and documentation of the visual field is an important diagnostic test. The test measures the sensitivity of visual perception, mostly in terms of differential luminance sensitivity, as a function of location within the visual field [20]. In visual field examinations, test objects, most commonly light stimuli, are projected onto a uniform background, and subjects press a response button to indicate that they detected a stimulus. The size and locations of a grid of stimuli are kept constant while their luminance is varied until the dimmest stimulus that can be perceived by the subject at each stimulus location is identified. The location and pattern of missed stimuli define the type of visual field defect.

For mobile campimetry, Vishnoo integrates the PC-based Tübingen Mobile Campimeter (TMC) that we developed and evaluated in earlier work [22]. Figure 2 presents a visual field screening result of a healthy right eye obtained with Vishnoo. Dots represent detected stimuli, while dark rectangles represent missed stimuli where no light was perceived. The area that results from the cluster of missed stimuli corresponds to the blind spot. As TMC is suitable for fast screening of the visual field, Vishnoo can be used for diagnosis in ophthalmology and neurology.
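The "vary the luminance until the dimmest perceivable stimulus is found" procedure can be sketched as a simple staircase: dim after each seen response, brighten and halve the step after each miss. This is a generic illustration, not TMC's actual threshold strategy, and all names and values are assumptions.

```python
# Hedged sketch of a luminance staircase of the kind campimetry uses.
# Not TMC's actual algorithm; units and step sizes are illustrative.

def staircase(perceives, start=100, step=16):
    """Return the dimmest luminance still reported as seen.

    `perceives(lum)` models the subject's button press; the model is
    assumed to miss sufficiently dim stimuli, otherwise this never stops.
    """
    lum = start
    while step >= 1:
        if perceives(lum):
            lum -= step      # seen: try a dimmer stimulus
        else:
            lum += step      # missed: go brighter again
            step //= 2       # and refine the search
    return lum

# Subject model: perceives anything at or above luminance 37.
threshold = staircase(lambda lum: lum >= 37)
```

In a real examination this search runs independently at every grid location, and the per-location thresholds form the sensitivity map of Figure 2.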

Figure 2. Result of the visual field screening with Vishnoo’s task TMC

4. Search Tasks

Vishnoo provides four built-in visual search tasks: MAFOV (Figure 3(a)), a comparative search task (Figure 3(b)), the Machner test as a conjunctive search task (Figure 3(c)) and a video-based search task (Figure 3(d)). Comparative search tasks are usually used to measure the visual span. The user has to detect a local match or mismatch between two displays that are presented side by side. When searching for a mismatch, as in the task presented in Figure 3(b), users can perform very differently depending on the visual scanning strategy they use. Vishnoo enables easy configuration of such tasks; user feedback and eye-tracking information are integrated automatically.

The video plugin enables the investigation of scanning strategies when natural-scene images are presented. This plugin is useful in research studies aiming at the evaluation or further development of established computational models of visual attention, e.g. that of Itti & Koch [10].

4.1. MAFOV Test

The MAFOV (Modified Attended Field Of View) task is a feature search (pop-out) task that can be used to investigate the preattentive mechanisms of visual perception. A Landolt ring (the stimulus) is presented among annuli (the distractors) within thirty degrees of eccentricity. Stimulus and distractors are arranged in a grid (Figure 3(a)) that corresponds to a subset of locations defined in the standardized visual field examination grid implemented in the Octopus perimeters (Haag-Streit Inc., Koeniz, Switzerland). During the search task the stimuli are presented randomly and sequentially. When a stimulus is presented, all other grid positions are used as distractors and represented by an annular shape of the same size (Figure 3(a)). After the presentation time the stimulus is replaced by a distractor. The user then marks the grid position where he or she believes to have perceived the presented stimulus (Figure 4(b)). The information about the presented stimulus, the user's response behavior, the subject's visual search and the user's response time is aggregated into a final performance score, the MAFOV score. The MAFOV score is a value in the range of one to ten, where ten represents best performance (the marked grid position matches the position of the presented stimulus perfectly) and one stands for failing to detect the stimuli by a wide margin. To compute this score, the sum of the distances between each stimulus and the corresponding position in the user response is calculated. One dimension we consider in the computation of the MAFOV score is the radial precision. This measure accounts for irregular or sparse grids, where naive approaches using plain Euclidean distances would result in unreliable scores. To calculate the radial precision we consider the grid density in the neighborhood of a stimulus. The resulting distance measure expresses the stimulus-feedback distance relative to the nearest neighbours in the grid:
(1 / #Neighbours) · Σ_Neighbours [ d_euc(Stimulus, Feedback) / d_euc(Stimulus, Neighbour) ]

By default, the three nearest neighbor stimuli are used to calculate the mean distance. This approach provides good results for both dense and sparse grids. An example result of the MAFOV task is depicted in Figure 5.
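The radial-precision measure described above can be written down compactly: the Euclidean stimulus-feedback distance is normalised by the distance to each of the k nearest grid neighbours and the ratios are averaged. The following sketch follows that reading of the formula; variable names and the example grid are ours, not Vishnoo's.

```python
# Sketch of the relative (radial-precision) distance: the Euclidean
# stimulus-feedback distance, averaged after normalising by the
# distances to the nearest grid neighbours. Names are illustrative.
import math

def d_euc(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def radial_precision(stimulus, feedback, grid, k=3):
    """(1/k) * sum over the k nearest neighbours of
       d_euc(stimulus, feedback) / d_euc(stimulus, neighbour)."""
    neighbours = sorted((p for p in grid if p != stimulus),
                        key=lambda p: d_euc(stimulus, p))[:k]
    return sum(d_euc(stimulus, feedback) / d_euc(stimulus, n)
               for n in neighbours) / len(neighbours)

# Dense unit grid: a feedback one grid step away scores exactly 1.0,
# i.e. "off by one typical neighbour distance".
grid = [(x, y) for x in range(5) for y in range(5)]
score = radial_precision((2, 2), (2, 3), grid)
```

The normalisation is what makes the measure robust for sparse grids: the same absolute error counts for less where neighbours are far apart.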

Currently we are using this task in several ophthalmological research studies to examine the visual search behavior and exploration capability of subjects with visual field defects. As the shape, size, color and arrangement of stimuli and distractors in Vishnoo are freely configurable, users can easily adapt this task to meet their requirements.

Figure 3. Visual Search Tasks in Vishnoo: (a) MAFOV, (b) Comparative Search Task, (c) Conjunction Search Task, (d) Free Search Task

Figure 4. MAFOV

4.2. Machner Test

The Machner test is a conjunction search task developed by Machner et al. [13]. Conjunction search represents the process of searching for a target that is defined by a combination of two or more properties, e.g. shape and color. An example task using the Machner test could be to find all red rectangles in the presented image (Figure 3(c)). To find these targets the subject has to search the image systematically. In contrast to the MAFOV task, where the subject's performance depends mainly on pre-attentive visual perception, in conjunction search the performance depends above all on higher cognitive functions like the visual search strategy and spatial working memory.

Figure 5. Visualization of a MAFOV result

During this task eye movements are recorded and the visual scan path is analyzed. The analysis comprises the number of saccades, the time spent examining each region, the number of correct answers and the subject's search strategy. The Machner test as presented here is a specific configuration of the conjunction search tasks that can be designed with Vishnoo. Again, as the shape, color and position of targets are freely configurable, users can use the templates of the Machner test to design similar tasks.

Besides vision research, the Machner test can also be used for rehabilitation in medicine, e.g. training new search strategies in subjects with an impaired visual field to help them regain a better perception of their environment.

5. Eye-Tracking and Scan Path Analysis

Although there are some well-performing commercial eye trackers available, e.g. [7], [11], [21], [23], they suffer from two big disadvantages compared with open-source solutions: they are either available only at very high cost, and thus unaffordable for many academic research or clinical studies, or delivered as black-box solutions that do not allow access to the signal/video processing routines. Nevertheless, for both categories the methods consist of two main parts: the extraction of the pupil and its coordinates using image processing methods, and the mapping of the pupil position coordinates into coordinates in the scene image. The detection of the pupil is performed mainly using either feature-based or model-based approaches. Feature-based approaches aim at localizing image features related to the position of the eye, e.g. threshold-based techniques where the pupil is detected based on a threshold obtained from image characteristics or manually specified by the user, e.g. [2], [16]. In model-based approaches, pupil contour detection is done by iteratively fitting a model, e.g. finding the best-fitting circle or ellipse for the pupil contour [5], [15]. Model-based techniques provide a more precise estimation of the pupil position at the cost of computational speed, which is crucial especially at high sampling rates (frames per second) and image resolutions.

In Vishnoo we have integrated Starburst, a hybrid eye-tracking method introduced by [12] that combines feature-based and model-based algorithms for infrared images. After locating and removing the corneal reflection from the image, Starburst locates pupil edge points using an iterative feature-based technique. In a next step the algorithm fits an ellipse to a subset of the detected edge points using the Random Sample Consensus (RANSAC) paradigm [12] (Figure 6). To map the pupil position into coordinates in the scene, a second-order polynomial mapping based on a 9-point calibration grid is used. In our setup we use a monochrome USB camera with an infrared filter at a sampling rate of 60 frames per second. As eye tracking is a plugin in Vishnoo, integrated as a .dll, the Starburst algorithm can easily be replaced by other eye-tracking solutions.

Figure 6. Detection of the pupil by the eye-tracking module
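The second-order polynomial mapping from pupil to scene coordinates can be illustrated as a least-squares fit over the six quadratic monomials, using the nine calibration points. This is a generic sketch of such a mapping, not Starburst's actual fitting code; the synthetic coefficients and all names are assumptions.

```python
# Hedged sketch of a second-order polynomial mapping from pupil
# coordinates (px, py) to one scene coordinate, fitted by least squares
# from a 9-point calibration grid. Not Starburst's actual implementation.

def features(px, py):
    """Quadratic monomial basis for the polynomial mapping."""
    return [1.0, px, py, px * py, px * px, py * py]

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit(pupil_pts, scene_coords):
    """Least squares via the normal equations (X^T X) c = X^T y."""
    X = [features(px, py) for px, py in pupil_pts]
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(6)]
           for i in range(6)]
    Xty = [sum(X[k][i] * scene_coords[k] for k in range(len(X)))
           for i in range(6)]
    return solve(XtX, Xty)

# Synthetic 9-point calibration grid and a known mapping
# x_scene = 2 + 3*px + 0.5*px*py, which the fit should recover.
pupil = [(px, py) for px in (0.0, 1.0, 2.0) for py in (0.0, 1.0, 2.0)]
scene_x = [2 + 3 * px + 0.5 * px * py for px, py in pupil]
coeffs = fit(pupil, scene_x)
```

In practice one such polynomial is fitted per scene axis, and the nine calibration targets over-determine the six coefficients, which gives some robustness to noisy pupil estimates.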

Visual scan path analysis: Scan path modelling and analysis is done by a modified version of iComp [8]. Fixations and saccades are identified by calculating movement variances, the speed of movement between adjacent measurements and fixation durations [3]. Single measurements are clustered into fixations using a Gaussian kernel. Letters are then automatically assigned to fixation clusters, and scan paths are compared using string-editing and alignment algorithms [6]. The final scan path is visualized as ellipses around the fixation centers, where the ellipse dimensions correspond to the variance of the eye movements mapped to the fixation, obtained by a principal component analysis.
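The string-editing comparison works on the letter sequences assigned to fixation clusters. As a minimal illustration, two scan paths can be compared with a plain Levenshtein edit distance; iComp's actual alignment algorithms [6] are more involved, so treat this only as a sketch of the idea.

```python
# Sketch of string-based scan path comparison: fixation clusters become
# letters, and two scan paths are compared by edit distance. Plain
# Levenshtein here; the cited alignment methods are more sophisticated.

def edit_distance(a, b):
    """Minimum insertions/deletions/substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def similarity(a, b):
    """Normalised similarity in [0, 1]; 1 means identical scan paths."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

# Two subjects visiting lettered fixation clusters in slightly
# different order:
s = similarity("ABCDE", "ABDDE")
```

The normalised score makes scan paths of different lengths comparable, which matters when subjects differ in how many fixations they need.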

6. Conclusion

Vishnoo is a new platform approach providing a wide range of visual search tasks for the easy and fast examination of the visual field and visual attention. Vishnoo combines easily adaptable stimulus presentation, eye tracking and evaluation of the visual scan path in a single platform. So far, Vishnoo is delivered with four highly configurable built-in tasks. The underlying software architecture is modular and plugin-based, so new features like user-specific hardware, database connections, new stimulus types or even new search tasks can be added very easily. Different layers of control and configuration make Vishnoo an attractive choice for both scientific research studies and clinical practice. Practical usage was evaluated in cooperation with ophthalmologists and demonstrated easy adaptation to changing requirements in research studies without the need for long-term development or a fixed examination flow. Currently, Vishnoo is being used in research studies to examine the visual search, attention and exploration capabilities of subjects with visual field impairments such as glaucoma and hemianopsia.

The Vishnoo platform is available for download at http://www.vishnoo.de/.

7. Acknowledgements

This work was partially supported by the MFG Stiftung Baden-Württemberg and the Wilhelm-Schuler-Stiftung.

References

[1] D. H. Brainard. The Psychophysics Toolbox. Spatial Vision, 10(4):433-436, 1997.
[2] X. Brolly and J. Mulligan. Implicit calibration of a remote gaze tracker. In IEEE Conference on CVPR Workshop on Object Tracking Beyond the Visible Spectrum, 2004.
[3] R. G. Brown and P. Y. C. Hwang. Introduction to Random Signals and Applied Kalman Filtering. John Wiley and Sons, 2nd edition, 1997.
[4] Cedrus Corporation. http://www.superlab.com, 2011.
[5] J. Daugman. High confidence visual recognition of persons by a test of statistical independence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11):1148-1161, 1993.
[6] A. T. Duchowski, J. Driver, S. Jolaoso, W. Tan, B. N. Ramey, and A. Robbins. Scanpath comparison revisited. In Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications, ETRA '10, pages 219-226, New York, NY, USA, 2010. ACM.
[7] Ergoneers GmbH. http://www.ergoneers.com/de/products/dlab-dikablis/overview.html, 2011.
[8] J. Heminghous and A. T. Duchowski. iComp: a tool for scanpath visualization and comparison. In Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pages 152-152, New York, NY, USA, 2006. ACM.
[9] Interactive Minds GmbH. http://www.interactive-minds.com/, 2011.
[10] L. Itti and C. Koch. Computational modelling of visual attention. Nature Reviews Neuroscience, 2(3):194-203, March 2001.
[11] LC Technologies, Inc. http://www.eyegaze.com/content/eyetracking-research-tools, 2011.
[12] D. Li, D. Winfield, and D. J. Parkhurst. Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 3, page 79, 2005.
[13] B. Machner, A. Sprenger, D. Kömpf, T. Sander, W. Heide, H. Kimmig, and C. Helmchen. Visual search disorders beyond pure sensory failure in patients with acute homonymous visual field defects. Neuropsychologia, June 2009.
[14] Neurobehavioral Systems, Inc. http://www.neurobs.com, 2011.
[15] K. Nishino and S. Nayar. Eyes for relighting. In ACM SIGGRAPH 2004, volume 23, pages 704-711, 2004.
[16] T. Ohno, N. Mukawa, and A. Yoshikawa. FreeGaze: a gaze tracking system for everyday gaze interaction. In Eye Tracking Research and Applications Symposium, pages 15-22, 2002.
[17] J. W. Peirce. PsychoPy - Psychophysics software in Python. Journal of Neuroscience Methods, 162(1-2):8-13, 2007.
[18] Psychology Software Tools, Inc. http://www.pstnet.com/eprime.cfm, 2011.
[19] J. San Agustin, H. Skovsgaard, E. Mollenbach, M. Barret, M. Tall, D. W. Hansen, and J. P. Hansen. Evaluation of a low-cost open-source gaze tracker. In Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications, ETRA '10, pages 77-80, New York, NY, USA, 2010. ACM. http://www.gazegroup.org/downloads/23-gazetracker/.
[20] U. Schiefer, H. Wilhelm, and W. Hart. Clinical Neuro-Ophthalmology: A Practical Guide. Springer Verlag, Berlin, 1st edition, 2008.
[21] SensoMotoric Instruments GmbH. http://www.smivision.com, 2011.
[22] E. Tafaj, C. Uebber, J. Dietzsch, U. Schiefer, M. Bogdan, and W. Rosenstiel. Introduction of a portable campimeter based on a laptop/tablet PC. In Proceedings of the 19th Imaging and Perimetry Society (IPS), Spain, 2010.
[23] Tobii Technology AB. http://www.tobii.com, 2011.
