MIRIAM Resources: a robust annotation and cross-referencing framework

Camille Laibe

HUPO-PSI 2010 Spring Meeting, Seoul, Korea

EBI is an Outstation of the European Molecular Biology Laboratory.

Talk outline

– Background information
– BioModels.net
– MIRIAM Standard
– Annotations
– MIRIAM URNs
– MIRIAM Resources
– Examples of usage
– How could all this help you?

BioModels.net

Community aiming to improve model annotation, interpretation, exchange and reuse.
Not restricted to any format (e.g. SBML, CellML, …)

Current projects:
– A checklist for model annotation: MIRIAM, and its supporting infrastructure: MIRIAM Resources
– A set of relationships (qualifiers) to link model and data: BioModels.net qualifiers
– An ontology to make the semantics of models precise: SBO
– A public database of published models of biological interest: BioModels Database

Several ongoing efforts: MIASE, KiSAO, SED-ML, TEDDY, ...

http://biomodels.net/

Quantitative biochemical models

[Figure: a biochemical model, its mathematical model, its computational model and a simulation. Source: Tyson et al (1991) PNAS 88(1):7328-32]

MIRIAM

The Minimum Information Required In the Annotation of a Model

http://biomodels.net/miriam/

MIRIAM Standard

– proposed guidelines for curation and annotation of quantitative models
– about encoding and annotation
– applicable to any structured model format

cf. Nicolas Le Novère et al. Minimum Information Requested in the Annotation of biochemical Models (MIRIAM). Nature Biotechnology, 2005

MIRIAM Compliance

Models must:
– be encoded in a public machine-readable format
– be clearly linked to a single publication
– reflect the structure of the biological processes described in the reference paper (list of reactions, ...)
– be instantiable in a simulation (possess initial conditions, ...)
– be able to reproduce the results given in the reference paper
– contain creator’s contact details
– be annotated: the annotation must unambiguously identify each model constituent

Why are annotations important?

Annotations of model components are essential to:
– unambiguously identify model components
– improve understanding of the structure of the model
– allow easier comparison of different models
– ease the integration of models
– allow efficient search strategies
– add a semantic layer to the model
– improve understanding of the biology behind the model
– allow conversion and reuse of the model
– ease the integration of model and biological knowledge

→ True for any kind of data, not only models!

Why should annotations not be raw text?

EMBL bank version 45 (04-DEC-1995):
    /db_xref="PID:g984120"
EMBL bank version 47 (07-JUN-1996):
    /db_xref="PID:g984120"
    /db_xref="SWISS-PROT:P49581"
EMBL bank version 60 (03-SEP-1999):
    /db_xref="SWISS-PROT:P49581"
    /protein_id="CAA58766.1"
EMBL bank version 73 (30-NOV-2002):
    /db_xref="SWISS-PROT:P49581"
    /protein_id="CAA58766.1"
    /db_xref="GOA:P49581"
EMBL bank version 79 (08-JUN-2004):
    /db_xref="UniProt/Swiss-Prot:P49581"
    /protein_id="CAA58766.1"
    /db_xref="GOA:P49581"
EMBL bank version 84 (12-SEP-2005):
    /db_xref="UniProtKB/Swiss-Prot:P49581"
    /protein_id="CAA58766.1"
    /db_xref="GOA:P49581"

Not consistent!
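The inconsistency can be made concrete with a minimal sketch. It takes three of the db_xref qualifiers quoted from the EMBL bank releases, splits each into a database name and an accession, and counts the distinct values. The helper name split_xref is hypothetical and only serves the illustration.

```python
import re

# Three db_xref qualifiers quoted from successive EMBL bank releases;
# all of them point at the same protein record.
XREFS = [
    '/db_xref="SWISS-PROT:P49581"',
    '/db_xref="UniProt/Swiss-Prot:P49581"',
    '/db_xref="UniProtKB/Swiss-Prot:P49581"',
]


def split_xref(line):
    """Split a /db_xref="NAME:ID" qualifier into (NAME, ID).

    Hypothetical helper, written only for this illustration.
    """
    match = re.fullmatch(r'/db_xref="([^:"]+):([^"]+)"', line)
    if match is None:
        raise ValueError(f"not a db_xref qualifier: {line!r}")
    return match.group(1), match.group(2)


names = {split_xref(x)[0] for x in XREFS}
accessions = {split_xref(x)[1] for x in XREFS}

# One accession, but three spellings of the database name:
# a plain string comparison of the raw text would treat them as different.
print(len(accessions), "accession,", len(names), "database names")
```

Software matching these cross-references as raw text would need a hand-maintained list of synonyms for every database.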

Why should annotations not be uncontrolled XML?

Extracted from a BioPAX model: CGD CSSM34


What is “CGD”?

Ambiguous! CGD is the official acronym for:
– Candida Genome Database
– Cattle Genome Database
– Comparative Genomics Database
– Chronic Granulomatous Disease

Hopefully, things are now changing (cf. Pathway Commons)...

Why should annotations not be uncontrolled URLs?

Minimum information requested in the annotation of biochemical models (MIRIAM)
PMID: 16333295

URLs (physical addresses):
– http://www.ebi.ac.uk/citexplore/citationDetails.do?dataSource=MED&externalId=16333295
– http://www.ncbi.nlm.nih.gov/pubmed/16333295
– http://www.hubmed.org/display.cgi?uids=16333295
– http://srs.ebi.ac.uk/srsbin/cgibin/wgetz?view+MedlineFull+[medlinePMID:16333295]

Not unique! Not perennial!


Characteristics of a useful identifier

– Unique: an identifier must never be assigned to two different objects;
– Perennial: the identifier is constant and its lifetime is permanent;
– Standards compliant: identifiers must conform to existing standards, such as URI;
– Resolvable: identifiers must be transformable into locations of online resources storing the object or information about the object;
– Free to use: everybody should be able to use and create identifiers, freely and at no cost.

MIRIAM URN

– Data type identifier: a URI. Not a URL, not a “Web address”!
– Dataset identifier: a text string. Format depends on the resource identified by the data type.

Examples:
– UniProt and P62158 (human calmodulin): urn:miriam:uniprot:P62158
– EC code and 1.1.1.1 (alcohol dehydrogenase): urn:miriam:ec-code:1.1.1.1
– Gene Ontology and GO:0000186 (activation of MAPKK activity): urn:miriam:obo.go:GO%3A0000186
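The three examples follow one pattern: the prefix urn:miriam:, the data type name, and the dataset identifier with reserved characters percent-encoded (the ":" of GO:0000186 becomes %3A). A minimal sketch of building and splitting such URNs follows; the function names make_miriam_urn and split_miriam_urn are hypothetical and not part of any MIRIAM library.

```python
from urllib.parse import quote, unquote


def make_miriam_urn(data_type, identifier):
    """Build a MIRIAM URN; reserved characters in the identifier are percent-encoded."""
    return f"urn:miriam:{data_type}:{quote(identifier, safe='')}"


def split_miriam_urn(urn):
    """Return (data_type, identifier) for a MIRIAM URN, decoding the identifier."""
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[:2] != ["urn", "miriam"]:
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    return parts[2], unquote(parts[3])


# The three examples from the slide
print(make_miriam_urn("uniprot", "P62158"))
print(make_miriam_urn("ec-code", "1.1.1.1"))
print(make_miriam_urn("obo.go", "GO:0000186"))
print(split_miriam_urn("urn:miriam:obo.go:GO%3A0000186"))
```

Because the identifier is encoded, the colon inside a Gene Ontology identifier cannot be confused with the separators of the URN itself.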

Qualification of the annotation

[Figure: a model element represents biological entity A; the annotation represents biological entity B; the qualifier represents the relationship between biological entity A and biological entity B.]

Current BioModels.net qualifiers

Model qualifiers:
– bqmodel:is
– bqmodel:isDerivedFrom
– bqmodel:isDescribedBy

Biology qualifiers:
– bqbiol:is
– bqbiol:isDescribedBy
– bqbiol:hasPart
– bqbiol:isPartOf
– bqbiol:hasProperty
– bqbiol:isPropertyOf
– bqbiol:hasVersion
– bqbiol:isVersionOf
– bqbiol:isHomologTo
– bqbiol:encodes
– bqbiol:isEncodedBy
– bqbiol:occursIn
– ...

http://biomodels.net/qualifiers/

SBML and MIRIAM
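A minimal sketch of how a qualified MIRIAM annotation might be written as RDF, assuming the layout commonly used inside SBML annotation elements (an rdf:Description pointing at the element's metaid, a qualifier element, and an rdf:Bag of resources). The metaid "calmodulin" and the function name build_annotation are hypothetical; the URN is the UniProt example given earlier.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("bqbiol", BQBIOL)


def build_annotation(meta_id, qualifier, urns):
    """Return an rdf:RDF element linking a model element to MIRIAM URNs.

    meta_id   -- metaid of the annotated model element (hypothetical value below)
    qualifier -- local name of a bqbiol qualifier, e.g. "is" or "isVersionOf"
    urns      -- iterable of MIRIAM URNs
    """
    rdf = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}about": f"#{meta_id}"}
    )
    relation = ET.SubElement(description, f"{{{BQBIOL}}}{qualifier}")
    bag = ET.SubElement(relation, f"{{{RDF}}}Bag")
    for urn in urns:
        ET.SubElement(bag, f"{{{RDF}}}li", {f"{{{RDF}}}resource": urn})
    return rdf


rdf = build_annotation("calmodulin", "is", ["urn:miriam:uniprot:P62158"])
xml_text = ET.tostring(rdf, encoding="unicode")
print(xml_text)
```

The qualifier carries the relationship (here bqbiol:is), while the URN carries the identity of the biological entity, so neither depends on a physical web address.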

MIRIAM Resources

– browsing
– searching
– editing
– export (XML)
– Web Services
– documentation

http://www.ebi.ac.uk/miriam/
Open Source: available on SourceForge.net

Camille Laibe and Nicolas Le Novère. MIRIAM Resources: tools to generate and resolve robust cross-references in Systems Biology. BMC Systems Biology, 2007
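Resolution can be illustrated with a minimal sketch: one stable URN is mapped to whichever physical addresses are currently known for its data type. This is not the MIRIAM Web Services API. REGISTRY is a hypothetical in-memory table, the data type name "pubmed" is assumed, and the URL patterns are derived from the PubMed addresses quoted earlier in the talk.

```python
from urllib.parse import unquote

# Hypothetical in-memory registry: data type name -> URL templates.
# A real registry would be maintained centrally and updated when
# physical addresses change, so that the URNs stay untouched.
REGISTRY = {
    "pubmed": [
        "http://www.ncbi.nlm.nih.gov/pubmed/{id}",
        "http://www.hubmed.org/display.cgi?uids={id}",
        "http://www.ebi.ac.uk/citexplore/citationDetails.do"
        "?dataSource=MED&externalId={id}",
    ],
}


def resolve(urn):
    """Return every known physical location for a MIRIAM URN."""
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[:2] != ["urn", "miriam"]:
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    data_type, identifier = parts[2], unquote(parts[3])
    if data_type not in REGISTRY:
        raise KeyError(f"unknown data type: {data_type!r}")
    return [template.format(id=identifier) for template in REGISTRY[data_type]]


locations = resolve("urn:miriam:pubmed:16333295")
for url in locations:
    print(url)
```

If one of the sites disappears or changes its address scheme, only the registry entry changes; the annotation stored in the model stays the same.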

MIRIAM Resources overview

MIRIAM Resources: http://www.ebi.ac.uk/miriam/

MIRIAM data types

MIRIAM data type

Resources

MIRIAM resources reliability

Applications: whole cell metabolic models

Yeast Metabolic Model
Herrgård M.J., Swainston N., Dobson P., Dunn W.B., Arga K.V., Arvas M., Blüthgen N., Borger S., Costenoble R., Heinemann M., Hucka M., Le Novère N., Li P., Liebermeister W., Mo M.L., Oliveira A.P., Petranovic D., Pettifer S., Simeonidis E., Smallbone K., Spasic I., Weichart D., Brent R., Broomhead D.S., Westerhoff H.V., Kırdar B., Penttilä M., Klipp E., Palsson B.Ø., Sauer U., Oliver S.G., Mendes P., Nielsen J., Kell D.B. (2008) A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology. Nature Biotechnology, 26: 1155-1160.
2152 species, 1857 reactions, 4861 MIRIAM annotations

Human Metabolic Model
4889 species, 8866 reactions, 66968 MIRIAM annotations

Models clustering

Krause F, Liebermeister W (2009) A simple clustering of the BioModels database using semanticSBML. Nature Precedings, doi:10.1038/npre.2009.3444.1

Models comparison

http://www.semanticsbml.org/
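Shared identifiers are what make such clustering and comparison possible: two models that annotate their components with the same URNs can be compared without reading their free-text names. The sketch below is not the method used by semanticSBML; it is a plain Jaccard index over two hypothetical annotation sets built from the URN examples given earlier.

```python
def jaccard(annotations_a, annotations_b):
    """Jaccard index of two collections of annotation URNs (0.0 if both are empty)."""
    a, b = set(annotations_a), set(annotations_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical annotation sets for two models
model_a = {"urn:miriam:uniprot:P62158", "urn:miriam:ec-code:1.1.1.1"}
model_b = {"urn:miriam:uniprot:P62158", "urn:miriam:obo.go:GO%3A0000186"}

similarity = jaccard(model_a, model_b)
print(f"{similarity:.3f}")  # one shared URN out of three distinct ones
```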

Tools developing support for MIRIAM URNs

Data resources:
– BioModels Database (kinetic models)
– Pathway Commons (BioPAX)
– Physiome Model Repository (CellML)
– SABIO-RK (reaction kinetics)
– Yeast consensus model database
– Human consensus model database
– E-MeP (structural genomics)

Application software:
– ARCADIA (graph editor)
– BIOUML (modelling and simulation)
– COPASI (simulation)
– LibAnnotationSBML
– LibSBML
– SAINT (semantic annotation)
– SBML2BioPAX
– SBML2LaTeX
– SBMLeditor (model editor)
– SemanticSBML (annotation, merging, comparison, ...)
– Snazer (network analysis, simulations)
– Systems Biology Workbench (model design and simulation)
– The Virtual Cell (simulation)

HUPO-PSI potential usage

Intact (PSI-MI 2.5):

Pride: [...]

Possible cooperation?

Proteomics Standards Initiative Controlled Vocabularies
Released October, 2006
Last maintenance update, April 2009
http://psidev.info/index.php?q=node/159

“As common reference system for databases MIRIAM resource is recommended.”

Acknowledgements

Nicolas Le Novère Nick Juty Camille Laibe

Henning Hermjakob Samuel Kerrien Juan A Vizcaino Sandra Orchard Luisa Montecchi-Palazzi

– The community of computational systems biology, for the development of MIRIAM and the implementation of MIRIAM support
– Data providers who replied, discussed and even complied with MIRIAM rules

Requirements for a MIRIAM compliant data type

– Open access: anybody can access any public data without restriction (no commercial licence, no login page, ...)
– Atomicity: the granularity of the data distributed has to be appropriately selected (a database of “reactions” distributes reactions and not pathways) and consistent (e.g. classes or instances, but not classes AND instances)
– Identifier: each atomic piece of data is associated with a unique and perennial identifier
– Community recognition: the resource has to be “recognised” by the corresponding experimental community, be reasonably supported, ...
