update on the zebrafish genome project

UPDATE ON THE ZEBRAFISH GENOME PROJECT Providing Manual Annotation for the Scientific Community J P Almeida-King, S Donaldson, G K Laird, D M Lloyd, H K Sehra, J E Collins, K Howe, B Reimholz, J Torrance, S Trevanion, D Stemple, J G R Gilbert, E Griffiths, J E Loveland, R Storey, J L Harrow, T Hubbard

I Barroso,

Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK

Zv8: The Latest Zebrafish Assembly

Overview The zebrafish genome, which consists of 25 linkage groups and is ~1.4Gb in size, is being sequenced, finished and analysed in its entirety at the Wellcome Trust Sanger Institute to provide an open source, high quality reference genome.

Our latest assembly was generated using 3 genetic maps: HS (meiotic), MGH (meiotic) & T51 (radiation hybrid). Each map was weighted based on the HS > MGH > T51 hierarchy, with one hit/marker/location.

The manual annotation, which is compiled in close collaboration with the Zebrafish Information Network (http://zfin.org/), is provided by the Human and Vertebrate Analysis and Annotation (HAVANA) group and is frequently released onto the Vertebrate Genome Annotation (Vega) database (http://vega.sanger.ac.uk) and may also be viewed as a DAS source in Ensembl (http://www.ensembl.org/Danio_rerio).

The resulting ordered finger print contigs (FPC) will eventually represent our assembly on which our annotation is based. The Ensembl release of Zv8 is due in late spring 2009. 3500

3500

HS Vs Zv7

HS Vs Zv8

2500

2500

We approach annotation using two main strategies. Firstly, the generation of gene lists composed of ZFIN cDNA that map to our current finished assembly. And, secondly, by using whole chromosome clone by clone annotation, where we have annotated over 4200 clones.

2000

The latest zebrafish assembly (Zv8), which represents the most accurate assembly to date is available in Pre-Ensembl (http://pre.ensembl.org/Danio_rerio/Info/Index).

HS Map (cM)

3000

H S M ap ( cM )

Annotation is based on the reference genome assembly sequence, which is derived from a minimal tile path composed of clones finished to a 99.9% sequence standard.

3000

1500

2000

1500

1000

1000

500

500

0 0

200000000

400000000

600000000

800000000

1000000000

Gen o m e ( b p )

1200000000

1400000000

0 0

200000000

400000000

600000000 Genome (bp)

800000000

1000000000

1200000000

The colours represent linkage groups and the dots represent unplaced contigs that fall outside of the assembly. Zv8 is a smaller assembly than Zv7 due to the removal of over 260Mb of haplotypic sequence.

The ZMAP Annotation Interface

Nature Precedings : doi:10.1038/npre.2009.3105.1 : Posted 20 Apr 2009

HAVANA annotation is based on same species and cross-species evidence (cDNA, EST, Protein) sourced from DNA databases and UniProt.

1

Refseq, Pfam, CpG islands, markers, tandem repeats, ab initio gene predictions are also displayed.

Building Genes Using Solexa Data

5 1

HAVANA transcripts

2

Poly-A signal & tail

3

Automated gene predictions

4

Pfam domains

5

Multi-species evidence

Solexa gene models

Orthologue CCNL1 with associated Solexa gene models. 11 potential novel variants are suggested by the models.

4

Solexa gene models are now incorporated alongside our dataset and are used as a guide by annotators. They are derived from over 32Gb of short read pairs obtained by deep sequencing 5 tissue types at 6 stages of development.

3

2 2 The fibroblast growth factor receptor 1 (fgfr1).

Access To Our Sequencing Pipeline

Compara View in Otterlace

http://www.sanger.ac.uk/cgi-bin/humpub/chromoview HUMAN

ZEBRAFISH

MOUSE

Chromoview is a resource tool that enables users to view our Tiling Path Format (TPF) alongside the ‘ A Golden Path’ (AGP) or assembly, which is derived from the TPF. Clones may be queried, providing information on assembly location and status in the pipeline. Variants of fgfr1 in human, zebrafish & mouse.

Community Directed Annotation

Vega: The Manual Annotation Database

As well as our on-going genome annotation we also welcome feedback and annotation requests for genes and regions of interest ([email protected]).

The Vega database is a unique central repository for high quality, frequently updated, manual annotation of vertebrate finished genomic sequence.

We have recently annotated 93 genes to facilitate the production of gene knock-out mutants, with the aim of studying the effect on phenotype in relation to human obesity.

Vega gives an accurate record of your gene of interest, including details of coding and noncoding variants and pseudogenes.

Annotation of the core MHC is also planned, which will complement the human and mouse MHC we have already published

Vega currently includes the full clone annotation of 12 linkage groups and all mapped ZFIN cDNA’ s.

MOUSE

HUMAN

ZEBRAFISH

The FTO gene was annotated in human, mouse and zebrafish as a direct response to a request from the community.

Annotation of the zebrafish MHC will focus on the core region on chromosome 19. Picture obtained from: BMC Genomics. 2005; 6: 152.

MapView showing chromosome 20

Vega release 35

update on the zebrafish genome project

update on the zebrafish genome project

Suggest Documents

Update on ARC Project

Project update

and the Human Genome Project

Zebrafish Resources on the Internet

Zebrafish Resources on the Internet

The Australian Synchrotron Project - Update

The Impact of the Human Genome Project on ... - BioMedSearch

Update on Contraception Awareness Project - Canadian Family ...

Annotation of the zebrafish genome through an ...

The Zebrafish Genome: A Review and msx Gene

Large-scale enhancer detection in the zebrafish genome - Development

Genome Biology - STAR PROJECT

Startup Genome Project - MailChimp

The UCSC Genome Browser database: 2017 update

'Race' and the Human Genome Project - The Race Card Project!

Project Update - CiteSeerX

April 2018 Project update

project update - Northern Pass

Project Update - Westmead Redevelopment

OROM Graphite Project Update

project update - Northern Pass

LCA Project Update

The Eucalyptus grandis Genome Project: Genome and ... - Springer Link

A fruitful outcome to the papaya genome project - Genome Biology