Developing and using the Neuroimaging and Data Sharing Data Model: the NIDASH Working Group

David Keator1, Satrajit Ghosh2, Camille Maumet3, Guillaume Flandin4, Nolan Nichols5, Thomas Nichols6, Gully Burns7, Rüdiger Brüehl8, Cameron Craddock9, Blaise Frederick10, Krzysztof Gorgolewski11, Daniel Marcus12, Michael Hanke13, Christian Haselgrove14, Karl Helmer15, Arno Klein16, Michael Milham17, Russell Poldrack18, Franck Michel19, Jason Steffener20, Yannick Schwartz21, Rich Stoner22, Jessica Turner23, David Kennedy24, Jean-Baptiste Poline25

1University of California Irvine, Irvine, CA, 2MIT, Cambridge, MA, 3University of Warwick, Coventry, United Kingdom, 4UCL Institute of Neurology, London, United Kingdom, 5University of Washington, Seattle, WA, 6University of Warwick, Dept. of Statistics, Coventry, United Kingdom, 7University of Southern California, Los Angeles, CA, 8Medical Metrology, Physikalisch-Technische Bundesanstalt, Berlin, Germany, 9Child Mind Institute, New York, NY, 10McLean Hospital, Harvard Medical School, Belmont, MA, 11Max Planck Institute for Human Brain and Cognitive Sciences, Leipzig, Germany, 12Neuroimaging Informatics and Analysis Center at Washington University, St. Louis, MO, 13Otto-von-Guericke Universität, Magdeburg, Germany, 14UMass Medical School, Worcester, MA, 15Massachusetts General Hospital, Charlestown, MA, 16Columbia University, New York, United States, 17Nathan Kline Institute for Psychiatric Research, New York, NY, 18UT Austin, Austin, United States, 19Centre national de la recherche scientifique, Paris, France, 20Columbia University, New York, United States, 21CEA, France, 22University of California San Diego, San Diego, CA, 23Georgia State University, Atlanta, United States, 24University of Massachusetts Medical School, Worcester, MA, 25Helen Wills Neuroscience Institute, UC Berkeley, CA, USA

Acknowledgments: We would like to acknowledge the work of all the INCF task force members, as well as of the many other colleagues who have helped the task force. We are particularly indebted to Mathew Abrams, Linda Lanyon, Roman Valls Guimera and Sean Hill for their support at the INCF. We further acknowledge the long-standing support of Derived Data Working Group activities by the BIRN coordinating center (NIH 1 U24 RR025736-01), and the Wellcome Trust for its support of CM & TEN.

Introduction

In neuroimaging, data sharing remains an exception [1]. While other disciplines (e.g., bioinformatics) require that data be made available in order to publish a paper, in neuroimaging, sharing is mainly confined to the paper itself.

The Biomedical Informatics Research Network (BIRN) Derived Data Working Group (DDWG) [10] recently joined forces with the Neuroimaging Data Sharing Task Force (NIDASH-TF), formed by the International Neuroinformatics Coordinating Facility’s (INCF) Program on Standards for Data Sharing [9], to support widespread publication and use of provenance, derived data, and resources in neuroimaging. The NIDASH-TF is working to facilitate sharing of neuroimaging data in a variety of forms, including raw or derived data and its provenance, lexical information, and exchange formats, while taking into account legal and ethical considerations.

Methods

The NIDASH Data Model (DM) working group holds weekly calls with participating members from the international community, as well as several INCF-hosted meetings per year.

Results

The NIDASH DM working group has developed terminologies for DICOM [6,7] and neuroimaging terms [2], and the Neuroimaging Data Model (NI-DM) [2,5]. NI-DM is a neuroimaging-specific extension of the PROV Data Model [11] that facilitates sharing of semantically meaningful neuroimaging provenance and derived data.

Using these tools, we have developed applications to federate data across relational databases and Excel spreadsheets [4], visualize FreeSurfer segmentations across a large cohort [3], and model SPM-generated statistical results [8].
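To make the federation idea concrete, the following is a minimal sketch, not the application described in [4]. It assumes that each source can be mapped to simple (subject, predicate, value) statements that share subject identifiers. The table name, column names, `ex:` predicates, and all values are hypothetical and invented for illustration; a CSV string stands in for a spreadsheet, and an in-memory SQLite database stands in for a relational database.

```python
import csv
import io
import sqlite3


def triples_from_sqlite(conn):
    """Map rows of a hypothetical 'subjects' table to (subject, predicate, value)."""
    query = "SELECT subject_id, age FROM subjects ORDER BY subject_id"
    for subject_id, age in conn.execute(query):
        yield (subject_id, "ex:age", age)


def triples_from_csv(text):
    """Map rows of a hypothetical spreadsheet export to the same triple shape."""
    for row in csv.DictReader(io.StringIO(text)):
        yield (row["id"], "ex:hippocampus_volume_mm3", float(row["hippo_vol"]))


def federate(*sources):
    """Merge triples from every source into one dictionary keyed by subject."""
    merged = {}
    for source in sources:
        for subject, predicate, value in source:
            merged.setdefault(subject, {})[predicate] = value
    return merged


def older_than(merged, min_age):
    """Example query over the merged view: subjects above a given age."""
    return sorted(s for s, props in merged.items() if props.get("ex:age", 0) > min_age)


def demo():
    # Made-up illustrative values, not data from any study.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subjects (subject_id TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO subjects VALUES (?, ?)", [("sub-01", 34), ("sub-02", 61)]
    )
    sheet = "id,hippo_vol\nsub-01,4100.5\nsub-02,3650.0\n"
    merged = federate(triples_from_sqlite(conn), triples_from_csv(sheet))
    conn.close()
    return merged


if __name__ == "__main__":
    view = demo()
    print(view)
    print(older_than(view, 50))
```

Once both sources are expressed in the same shape, a single query can span them without knowing where each value originated. A shared data model is meant to provide that common shape.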

The INCF wiki is the primary resource for disseminating information and contains weekly minutes, publications, and links to products.

We have begun developing detailed specifications of the core NI-DM standard, an extension of the W3C PROV model, along with "object models" that specify the recommended set of entities, agents, and activities for describing a workflow and/or derived data product: http://nidm.nidash.org/specs/
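As a rough illustration of the entity, activity, and agent vocabulary that PROV provides, the sketch below builds a tiny provenance record and prints it in a PROV-N-like notation. It is not an NI-DM object model. The `ex:` identifiers and attribute names are hypothetical placeholders, and the actual terms are those defined in the specifications.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str   # one of the three PROV core types: entity, activity, agent
    ident: str  # qualified name; the "ex:" prefix is a hypothetical namespace
    attrs: dict = field(default_factory=dict)

    def to_statement(self) -> str:
        """Render the node as a PROV-N-like statement."""
        if not self.attrs:
            return f"{self.kind}({self.ident})"
        pairs = ", ".join(f'{k}="{v}"' for k, v in sorted(self.attrs.items()))
        return f"{self.kind}({self.ident}, [{pairs}])"


def relation(name: str, subject: Node, obj: Node) -> str:
    """Render a relation; the trailing '-' marks an unspecified optional argument."""
    return f"{name}({subject.ident}, {obj.ident}, -)"


def build_example() -> list:
    # Illustrative workflow: software estimates a model from a scan
    # and produces a statistical map.
    raw = Node("entity", "ex:raw_scan", {"ex:format": "NIfTI"})
    stat_map = Node("entity", "ex:stat_map", {"ex:statistic": "T"})
    analysis = Node("activity", "ex:model_estimation")
    software = Node("agent", "ex:analysis_software", {"ex:name": "SPM"})
    return [
        raw.to_statement(),
        stat_map.to_statement(),
        analysis.to_statement(),
        software.to_statement(),
        relation("used", analysis, raw),
        relation("wasGeneratedBy", stat_map, analysis),
        relation("wasAssociatedWith", analysis, software),
    ]


if __name__ == "__main__":
    print("\n".join(build_example()))
```

The three relations used here (`used`, `wasGeneratedBy`, `wasAssociatedWith`) are standard PROV relations. An object model would constrain which entities, activities, and agents appear and which attributes they carry.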

INCF wiki: http://wiki.incf.org/mediawiki/index.php/Neuroimaging_Task_Force

We have also developed and deployed a website for sharing raw statistical maps (NeuroVault.org), which will use NI-DM. The INCF-TF meetings have encouraged adoption of these resources. We are linking this work with projects that provide data, host data, develop lexicons or ontologies, or generate derived data. The group includes developers and remains in close contact with projects that plan to use the developed resources, or may do so in the future when articles are linked to actual data analysis repositories (e.g., Neurosynth, NeuroVault, Brainspell), as well as with integration platforms such as NeuroDebian.

NIDASH code is made available through the “NI-” GitHub organisation. https://github.com/ni-

The NI-DM Data Model site provides specification documents and examples for NI-DM end-users: http://nidm.nidash.org

The NIDASH-TF meets several times a year to review progress on various projects that will make data sharing easier and more fruitful for the scientific community.

Conclusions

The short-term goals of the NIDASH DM working group are to:
1. Refine existing terminologies and object models.
2. Work with existing software packages (e.g., SPM) to incorporate the data model.
3. Create similar models for related tools (e.g., FSL, AFNI) so that common aspects across software packages can be identified.
4. Facilitate broad and expanded use of the NI-DM standard for data querying and data exchange, fostering applications such as meta-analyses.

References

[1] Poline J.B., Breeze J., Ghosh S., et al. Data sharing in neuroimaging research. Frontiers in Neuroinformatics. 2012;6:9.
[2] Keator D.B., Helmer K., Steffener J., et al. Towards structured sharing of raw and derived neuroimaging data across existing resources. NeuroImage. 2013 Nov 15;82:647-61.
[3] Nichols B.N., Stoner R., Keator D.B., et al. There’s an app for that: a semantic data provenance framework for reproducible brain imaging. Human Brain Mapping, Seattle, WA. 2013.
[4] Nichols B.N., Steffener J., Haselgrove C., et al. Mapping Neuroimaging Resources into the NIDASH Data Model for Federated Information Retrieval. Neuroinformatics 2013, Stockholm, Sweden. 2013.
[5] Ghosh S., Nichols B.N., Gadde S., et al. XCEDE-DM: A neuroimaging extension to the W3C provenance data model. Poster presentation at Neuro-Informatics Congress, Munich, Germany. 2012.
[6] Helmer K.G., Ghosh S., Nichols B.N., et al. INCF Neuroscience2012, Munich, Germany. 2012.
[7] Helmer K.G., Ghosh S., Keator D., et al. The Addition of Neuroimaging Acquisition, Processing and Analysis Terms to Neurolex. Human Brain Mapping, Hamburg, Germany. 2014.
[8] Maumet C., Nichols T., Nichols B.N., et al. Extending NI-DM to share the results and provenance of a neuroimaging study: an example with SPM. Human Brain Mapping, Hamburg, Germany. 2014.
[9] http://www.incf.org/core/programs/datasharing
[10] https://wiki.birncommunity.org