MIDAS - Multiple Integration of Data Annotation Study: Integration of genotype data from NGS and Sanger sequencing with HPO-coded phenotype data in a diagnostic setting
P-ClinG-083
Doelken SC1*, Eck SH1, Brumm R1, Kuecuek S1, Rath S1, Hasselbacher V1, Sofeso C1, Busse B1, Chahrokh-Zadeh S1, Marschall C1, Mayer K1, Rost I1, Klein HG1
* [email protected]
1 Center for Human Genetics and Laboratory Diagnostics, Dr. Klein, Dr. Rost and Colleagues, Lochhamer Str. 29, 82152 Munich, Germany
Introduction
To integrate next-generation sequencing (NGS) data, Sanger sequencing data and phenotype data in a diagnostic laboratory, we developed a software tool called MIDAS (Multiple Integration of Data Annotation Study). It combines patient data from the Laboratory Information Management System (LIMS), data from the routine Sanger sequencing workflow, and phenotype data based on the Human Phenotype Ontology (HPO) with NGS results. In particular, genotype-phenotype correlations identified in one patient are made available for all other cases, to aid interpretation and to build a comprehensive knowledge base.
Sequencing and Analysis Pipeline
For the NGS panel analysis, exonic regions of more than 1200 custom-selected genes are enriched in parallel by oligonucleotide hybridization and capture (Agilent QXT), followed by massively parallel sequencing on the Illumina NextSeq. Templates for smaller subpanels ensure that only genes from the requested indication are selected for data analysis, which limits interpretation to relevant genes and minimizes the possibility of incidental findings. Data analysis is performed using the CLC Genomics Workbench and custom-developed Perl scripts (Figure 1). Target regions that fail to reach the designated coverage threshold of 20X are re-analyzed by Sanger sequencing. Identified candidate mutations are independently confirmed. Phenotype information is standardized using Human Phenotype Ontology (HPO) terms.
Data Integration - MIDAS
MIDAS integrates patient data from our LIMS, data from the routine Sanger sequencing workflow, and phenotype data based on the Human Phenotype Ontology (HPO) with the NGS results (Figure 3). In particular, genotype-phenotype correlations identified in one patient are made available for all other cases to aid interpretation and build a comprehensive knowledge base. The data can be queried via a web interface for dynamic data access and filtering.
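The idea of making genotype-phenotype correlations from one patient available to all other cases can be illustrated with a minimal in-memory sketch. It is not the MIDAS implementation, which is not described here: the class `KnowledgeBase`, its methods, the case identifiers and the variant strings are all hypothetical, and a production system would presumably sit on a database behind the web interface. Only the HPO identifiers are real terms (HP:0001250, Seizure; HP:0001263, Global developmental delay).

```python
from collections import defaultdict


class KnowledgeBase:
    """Toy store linking variants to HPO terms recorded in carrier cases."""

    def __init__(self):
        # variant -> {HPO term -> set of case ids carrying both}
        self._index = defaultdict(lambda: defaultdict(set))

    def add_case(self, case_id, variants, hpo_terms):
        """Record every variant/HPO-term pair observed in one case."""
        for variant in variants:
            for term in hpo_terms:
                self._index[variant][term].add(case_id)

    def phenotypes_for(self, variant, exclude_case=None):
        """Return {HPO term: number of other cases} for a given variant."""
        result = {}
        # .get() avoids creating an empty entry for unknown variants
        for term, cases in self._index.get(variant, {}).items():
            count = len(cases - {exclude_case})
            if count:
                result[term] = count
        return result


# Hypothetical example data: variant names are made up for illustration.
kb = KnowledgeBase()
kb.add_case("case-1", ["GENE1:c.100A>G"], ["HP:0001250", "HP:0001263"])
kb.add_case("case-2", ["GENE1:c.100A>G", "GENE2:c.5del"], ["HP:0001250"])

# When interpreting case-2, look up what earlier carriers presented with.
prior = kb.phenotypes_for("GENE1:c.100A>G", exclude_case="case-2")
```

Excluding the case under review keeps the lookup from echoing the current patient's own phenotype back as supporting evidence.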
[Figure labels; diagrams not recoverable. Analysis pipeline: NGS Sequencing, Mapping, Target Region Statistics, Select Target Regions, Variant Calling, Variant Filter, coverage, Low Coverage Exons, Sanger Seq. Analysis software: CLCbio Genomics Workbench. MIDAS: LIMS, NGS-DB, HPO Interface, Variant Interface, Runs Interface, Reports.]
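The low-coverage rule in the pipeline description (target regions that fail to reach 20X are re-analyzed by Sanger sequencing) could be sketched as follows. The text does not say how coverage is summarized per region, so this sketch assumes a region fails if any base falls below the threshold. The function name `regions_for_sanger`, the input layout and the region names are hypothetical.

```python
COVERAGE_THRESHOLD = 20  # 20X, the threshold stated in the text


def regions_for_sanger(region_depths, threshold=COVERAGE_THRESHOLD):
    """Return the sorted names of target regions needing Sanger re-analysis.

    region_depths maps a region name to a list of per-base read depths.
    Assumption: a region fails if its minimum depth is below the threshold,
    or if no depth values were recorded for it at all.
    """
    flagged = []
    for name, depths in region_depths.items():
        if not depths or min(depths) < threshold:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical per-base depths for three target regions.
example = {
    "GENE1_exon2": [35, 41, 28, 22],  # every base at or above 20X
    "GENE1_exon3": [25, 19, 30],      # one base at 19X, so it fails
    "GENE2_exon1": [],                # no reads recorded
}
to_resequence = regions_for_sanger(example)
```

A mean-depth criterion would be a one-line change, but it could hide short under-covered stretches inside an otherwise well-covered exon.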