Gene Prediction Tutorials

Gene Prediction Tutorials Abhishek Kumar Nov 2014

Gene Prediction Tutorial

Why gene prediction?
• To know where the genes are in your genome
• To know how many genes it contains
• To know how much of your genome is coding and how much is non-coding

MAKER 2.0

• MAKER is an easy-to-use genome annotation pipeline
• MAKER can be used for de novo genome annotation

MAKER 2.0: Linux-based vs. web-based

Linux-based:
• Requires a good understanding of Linux
• Depends on a large number of Linux-based software packages
• Handles genomes of any size

Web-based:
• Simple
• No software installation is required
• Handles genomes of up to 10 Mb; dividing your genomic FASTA file into 10 Mb pieces and joining the results will work
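
The splitting step mentioned above can be scripted. The following is a minimal sketch in Python, assuming a plain FASTA input. The function names, the `_offset` header suffix, and the `genome_part` file prefix are illustrative choices and are not part of MAKER or MWAS. Whole sequences are kept together where possible, and a single sequence longer than the limit is cut into consecutive pieces.

```python
MAX_BASES = 10_000_000  # the 10 Mb limit mentioned for the web service


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_records(records, max_bases=MAX_BASES):
    """Group records into batches holding at most max_bases bases each.

    A sequence longer than max_bases is cut into consecutive pieces whose
    names carry the 1-based start offset (a hypothetical naming scheme).
    """
    batches, current, size = [], [], 0
    for header, seq in records:
        if len(seq) > max_bases:
            seq_id = header.split()[0] if header.split() else "seq"
            pieces = [
                (f"{seq_id}_offset{start + 1}", seq[start:start + max_bases])
                for start in range(0, len(seq), max_bases)
            ]
        else:
            pieces = [(header, seq)]
        for name, piece in pieces:
            # Start a new batch when this piece would exceed the limit.
            if current and size + len(piece) > max_bases:
                batches.append(current)
                current, size = [], 0
            current.append((name, piece))
            size += len(piece)
    if current:
        batches.append(current)
    return batches


def write_batches(batches, prefix="genome_part"):
    """Write each batch to <prefix>_<n>.fa and return the file names."""
    names = []
    for i, batch in enumerate(batches, start=1):
        name = f"{prefix}_{i}.fa"
        with open(name, "w") as out:
            for header, seq in batch:
                out.write(f">{header}\n{seq}\n")
        names.append(name)
    return names
```

A typical call would be `write_batches(split_records(read_fasta(open("genome.fa"))))`, where `genome.fa` stands for your own file. Note that cutting inside a long sequence can split a gene that spans the cut point, so predictions near piece boundaries deserve a second look when the results are joined.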

A local MAKER installation only works on Linux/Mac OS X

1. Requires large sets of modules and programs

2. Basic installation after installing modules and programs

Notes for Mac OS X users

Creating a login ID for the MAKER Web Annotation Service (MWAS)

Create a login ID

Logging into the MAKER Web Annotation Service (MWAS)

Log in with your user name and password

After logging into the MAKER Web Annotation Service (MWAS)

1. Uploading files
2. Running new jobs
3. Status of running jobs
4. Finished jobs
5. Click to see parameters
6. Click to see the steps that MAKER performed
7. Click to see/save results as JobID.tar.gz (use 7-Zip to open this file)
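
If 7-Zip is not at hand, the downloaded archive can also be opened with Python's standard `tarfile` module. This is a minimal sketch, assuming the results were saved as a gzip-compressed tar file; the helper names are illustrative, and the actual file name depends on your job ID.

```python
import tarfile


def list_archive(path):
    """Return the names of all members in a .tar.gz archive."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()


def extract_archive(path, dest="."):
    """Unpack the archive into dest and return the member names."""
    with tarfile.open(path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest)
    return names
```

Calling `list_archive` first is a cheap way to check what a job produced before unpacking it.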

Uploading files into the MAKER Web Annotation Service (MWAS)

You can upload the FASTA files and the SNAP HMM file from the provided folder.

Once your uploads are done, you will see your files listed.

Starting a new job in the MAKER Web Annotation Service (MWAS)

1. Select eukaryotic

2. Select your genome, for example genome4

3. EST evidence. An expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence, that is, a partial cDNA. ESTs are available via dbEST (http://www.ncbi.nlm.nih.gov/dbEST/) if someone has deposited them for your species. Let us assume we have no such evidence for our species, so leave the EST evidence field empty.

4. Select UniProt/Swiss-Prot. Supplying your own protein file is not needed now; if you want to use your own proteins for a genome, you need to supply them here.

5. Select RepeatRunner te_protein. Supplying your own repeat library is not needed now; it is only needed if you are masking specialized repeats specific to certain genomes. Otherwise, all known repeats are provided here by default.

6. Select the uploaded SNAP HMM file.

This option will not be needed now.

7. Select Y. Your job starts as soon as the jobs submitted by others before you have finished. You will see the results when they are ready.

Viewing results in the MAKER Web Annotation Service (MWAS)

The log shows what MAKER has done with the genomic data in order to finish the task.

All jobs are listed.

Results are available for download and for visualization in different genome browsers.

Visualization depends heavily on the required tools being properly installed and on Java having the proper security settings.

Unzip the saved file, JobID.tar.gz, to see the file structure. The results include:
• Transcripts
• Proteins

The first 40 proteins have been extracted and are provided here for annotation using BLAST2GO and Pfam domain scanning: http://tinyurl.com/40proteins
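
Taking the first 40 proteins from a protein FASTA file can be reproduced with a few lines of Python. This is a sketch under assumptions: the function names are illustrative, and the name of the protein file inside the unpacked results is not given in the tutorial, so any path passed in is your own.

```python
from itertools import islice


def fasta_records(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, parts = None, []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif line and header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def first_n_records(lines, n=40):
    """Return the first n FASTA records (fewer if the input is shorter)."""
    return list(islice(fasta_records(lines), n))


def to_fasta(records):
    """Format records back into FASTA text, one sequence per line."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)
```

For a file on disk, `to_fasta(first_n_records(open(path)))` gives text that can be pasted into a web form or saved for BLAST2GO and Pfam scanning.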

Augustus gene prediction on the web

http://bioinf.uni-greifswald.de/augustus/submission.php

• Paste your sequence
• Choose your organism

Augustus gene prediction with a Linux installation

Download, install, and configure Augustus on Linux.

Linux terminal:

$ augustus --species=aspergillus_nidulans --strand=both --singlestrand=false --genemodel=partial --codingseq=on --sample=100 --keep_viterbi=true --alternatives-from-sampling=true --minexonintronprob=0.2 --minmeanexonintronprob=0.5 --maxtracks=N input.fa --exonnames=on

This gives the same results as the web version.
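
When a genome has been divided into several FASTA files, the same Augustus call can be issued once per file from a script. The following is a hedged sketch, assuming `augustus` is installed and on your PATH. It uses only a subset of the options from the command above, and the helper names and output file naming are illustrative.

```python
import subprocess


def augustus_command(fasta_path, species="aspergillus_nidulans"):
    """Build the argument list for one Augustus run (subset of the options above)."""
    return [
        "augustus",
        f"--species={species}",
        "--strand=both",
        "--genemodel=partial",
        "--codingseq=on",
        "--exonnames=on",
        fasta_path,
    ]


def run_augustus(fasta_path, out_path, species="aspergillus_nidulans"):
    """Run Augustus and save what it prints on standard output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(augustus_command(fasta_path, species), stdout=out, check=True)


def run_on_files(fasta_paths, species="aspergillus_nidulans"):
    """Run Augustus on each file, writing <input>.augustus.out next to it."""
    outputs = []
    for path in fasta_paths:
        out_path = f"{path}.augustus.out"
        run_augustus(path, out_path, species)
        outputs.append(out_path)
    return outputs
```

Passing the arguments as a list avoids shell quoting problems with file names, and `check=True` makes a failed run raise an error instead of silently leaving an empty output file.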

GeneMark on the web

Needs only DNA as input.

http://exon.gatech.edu/GeneMark/heuristic_gmhmmp.cgi

Given the same input as Augustus, GeneMark broke the single gene into several pieces: poor performance.
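
One way to make this kind of fragmentation concrete is to count how many predicted genes overlap the locus where a single gene is expected. The sketch below uses made-up coordinates; the function name and all numbers are illustrative and are not taken from the tutorial's data.

```python
def overlapping_predictions(expected, predictions):
    """Return the predicted (start, end) intervals that overlap the expected gene.

    Coordinates are 1-based and inclusive.
    """
    exp_start, exp_end = expected
    return [
        (start, end)
        for start, end in predictions
        if start <= exp_end and end >= exp_start
    ]


expected_gene = (100, 2000)                          # hypothetical locus
one_model = [(120, 1980)]                            # a single gene model
fragmented = [(110, 600), (650, 1200), (1300, 1990)]  # one gene in three pieces

print(len(overlapping_predictions(expected_gene, one_model)))   # 1
print(len(overlapping_predictions(expected_gene, fragmented)))  # 3
```

A count above 1 at a locus where one gene is expected suggests that the tool has split the gene.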

FGENESH gene prediction on the web

http://www.softberry.com/berry.phtml?topic=fgenesh&group=programs&subgroup=gfind

Good performance: it predicted a single coding region, as expected.

Gene prediction using GeneWise 2

http://www.ebi.ac.uk/Tools/psa/genewise/

Input:
• Protein sequence
• DNA sequence

Good for confirming the accuracy of a previously predicted gene, as it needs both the DNA and the protein sequence. It aligns the protein sequence to the coding region of the DNA.

Take-home message
• MAKER 2.0 performs reasonably well
• Augustus is the best gene prediction tool, followed by SNAP, because they use species-specific parameters
• Trying out several methods is a good way to learn and to optimize for your purpose
