23rd International Conference of the European Federation for Medical Informatics User Centred Networked Health Care - A. Moen et al. (Eds.) MIE 2011 / CD / Posters
MOSAIC: Modular Ontology Semantics Architecture for Federated Biobanking

Sebastian BARTHOLOMÄUS (a,1), Martin LABLANS (a), Frank ÜCKERT (a)
(a) Institute of Medical Informatics, University of Münster, Germany
(1) Corresponding Author.

Abstract. Networking biobanks raises considerable challenges in the field of semantic interoperability: due to the heterogeneity and continuous evolution of medical domains, a metadata registry with a single common dataset is unlikely to serve the needs of all partners. We are developing a metadata architecture based on ISO 11179 and modular ontologies which allows partners to add metadata items as needed. We employ computer-aided creation, similarity relations and ratings for items to avoid redundancy, facilitate progression towards eventual homogenization and, in the meantime, still allow cross-domain search requests.

Keywords. Metadata, Ontology, ISO 11179, Federation, Biobank, Infrastructure

1. Introduction

In biomedical research, samples taken from one biobank alone are often insufficient. Our distributed biobanking strategy aims to provide an IT infrastructure that forms a federated network in which biomaterial can be managed and searched across organizational borders, while avoiding central components in order to protect the autonomy and domain-specific peculiarities of each partner.

One of the key challenges of this approach is semantic interoperability and metadata management. The infrastructure has to support collaboration across several heterogeneous organizations and specialized medical domains that themselves evolve over time. This eventually renders any fixed catalogue of metadata items insufficient. But even a metadata registry (MDR) that manages a collection of metadata items, usually expandable by a small group of metadata experts, is unlikely to serve the needs of everyone [1]. It is still a single common view, and with a growing number of partners in a network, the time needed to reach consensus grows significantly.

As a remedy, a metadata architecture is required that allows partners to add new metadata items as needed and still supports cross-organizational search queries right away by making the items comparable. It also has to provide techniques to avoid redundancy and facilitate progression towards homogenization.

2. Methods

To make metadata items comparable, our architecture is based on the ISO/IEC 11179 standard [2], which precisely specifies how metadata items are defined without prescribing a particular implementation. We combine this precision with the flexibility and standardization provided by technologies and tools from the semantic web context, inter alia OWL 2 [3]. The ontologies are stored and accessed using the Sesame framework [4] and the SwiftOWLIM database [5].
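To illustrate why the ISO/IEC 11179 structure makes items comparable, the following minimal Python sketch models a heavily simplified subset of the standard's core terms: a data element pairs a data element concept (object class plus property) with a value domain. It is not the MOSAIC ontology, which is expressed in OWL 2. The class layout, field names, identifier scheme and the two partner items are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElementConcept:
    """What is described: an object class plus one of its properties."""
    object_class: str
    property_name: str


@dataclass(frozen=True)
class ValueDomain:
    """How values are represented: a datatype and an optional unit."""
    datatype: str
    unit: Optional[str] = None


@dataclass(frozen=True)
class DataElement:
    """A metadata item: a concept paired with a value domain."""
    identifier: str  # hypothetical "partner:name" scheme, not from the paper
    concept: DataElementConcept
    value_domain: ValueDomain


def same_concept(a: DataElement, b: DataElement) -> bool:
    """Two items describe the same thing if their concepts are equal."""
    return a.concept == b.concept


# Hypothetical items published by two partners.
storage_temp = DataElementConcept("Sample", "storage temperature")
item_a = DataElement("partnerA:storage_temp", storage_temp,
                     ValueDomain("decimal", "degree Celsius"))
item_b = DataElement("partnerB:storage_temp", storage_temp,
                     ValueDomain("decimal", "kelvin"))
```

In this sketch, `same_concept(item_a, item_b)` holds although the value domains differ, which is the kind of distinction that lets two independently created items be recognized as related and mapped to each other.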
3. Results

We developed a slimmed-down ISO 11179-3 ed. 3 ontology that defines the structure of metadata items, similar to the XMDR project [6]. However, XMDR assumes a central metadata authority, which is incompatible with our federated biobanking strategy. The ISO ontology is imported by multiple module ontologies that define basic, domain-independent classes and properties, which are annotated with corresponding metadata items. A basic biobanking ontology combines and links these modules. It is in turn the basis for each partner's local ontology, which contains that partner's domain-specific extensions.

Partners can choose to publish their metadata items so that other partners can reuse or map to them. To avoid redundancy, we implement a computer-aided creation process that proposes existing items to be used instead of a new item. All published items can be rated along multiple quality dimensions, such as their usage compared to synonyms and similar items. These ratings can then be used to present highly rated items prominently, which promotes the use of established standards and advances towards a homogeneous description. In the meantime, the similarity relations between metadata items allow search requests to be broadened across domains. In our context, fuzzy search results are acceptable to a certain degree, as a researcher can always review result sets.

4. Conclusion

MOSAIC allows partners of a federated biobanking network to add domain-specific metadata as needed, while providing gentle incentives towards semantic homogenization and still offering the capability to search across domain borders right away. MOSAIC is used in the pilot implementation of our biobanking software for several international biomedical research networks, including TranSaRNet [7].
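The computer-aided creation process and the similarity-based broadening of search requests described in Section 3 can be illustrated with a minimal Python sketch. The abstract does not state how MOSAIC computes similarity, so the string similarity from Python's standard `difflib` module is used here only as a stand-in. The function names, the threshold value and the example labels are likewise assumptions.

```python
from difflib import SequenceMatcher


def suggest_existing(new_label, published_labels, threshold=0.6):
    """Propose already published items whose label resembles the new one.

    The threshold of 0.6 is an arbitrary illustrative value.
    """
    scored = [
        (SequenceMatcher(None, new_label.lower(), label.lower()).ratio(), label)
        for label in published_labels
    ]
    # Best match first; labels below the threshold are not proposed.
    return [label for score, label in sorted(scored, reverse=True)
            if score >= threshold]


def broaden(item, similarity_pairs):
    """Expand a search term by the items directly related to it.

    Similarity is treated as symmetric and only one hop is followed,
    to keep the fuzziness of the result set limited.
    """
    expanded = {item}
    for left, right in similarity_pairs:
        if left == item:
            expanded.add(right)
        elif right == item:
            expanded.add(left)
    return expanded


# Hypothetical published items and similarity relations.
published = ["Tumor grade", "Sample volume"]
relations = [("Tumor grade", "Histological grade")]

proposals = suggest_existing("Tumour grade", published)
search_terms = broaden("Tumor grade", relations)
```

A partner about to create "Tumour grade" would be offered the existing item with the near-identical label instead, and a query for "Tumor grade" would also cover the item related to it. Following only direct relations is a design choice of this sketch: a transitive expansion would return more results but also make them fuzzier.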
References

[1] Rosenthal A, Seligman L, Renner S. From semantic integration to semantics management: Case studies and a way forward. SIGMOD Record 2004; 33(4):44–50.
[2] Information technology - Metadata Registries (MDR) - Part 3. Final Committee Draft ISO/IEC FCD 11179-3 [cited 2011-4-27]. Available from: URL:http://metadata-stds.org/11179/#A3.
[3] OWL 2 Web Ontology Language document overview: W3C Recommendation, 27 October 2009; 2009 [cited 2011-4-29]. Available from: URL:http://www.w3.org/TR/2009/REC-owl2-overview-20091027.
[4] Sesame – A framework for processing RDF data [cited 2011-5-6]. Available from: URL:http://www.openrdf.org/.
[5] SwiftOWLIM – A semantic repository [cited 2011-5-9]. Available from: URL:http://www.ontotext.com/owlim.
[6] The eXtended MetaData Registry Project [cited 2011-4-22]. Available from: URL:http://www.xmdr.org/.
[7] Dirksen U, Nathrath M, Agelopoulos K et al. 2.O.05 Translational Sarcoma Research Network (TranSaRNet). J Bone Joint Surg Br 2010; 92-B(SUPP_III):437-b.