ensembl2genename


milospjanic / ensembl2genename


Convert human ENSEMBL IDs to gene names

Branch: master (11 commits, 1 branch, 0 releases, 1 contributor)

Latest commit f2c5a0b on May 13 by milospjanic: Rename ensemble2genename.sh to ensembl2genename.sh

File                      Last commit message                                          Updated
README.md                 Update README.md                                             4 months ago
ensembl2GeneNameMod2.sh   Rename ensemble2GeneNameMod2.sh to ensembl2GeneNameMod2.sh   4 months ago
ensembl2genename.sh       Rename ensemble2genename.sh to ensembl2genename.sh           4 months ago

README.md

ensembl2genename

This is a combined bash/R script that takes a file with human ENSEMBL gene IDs in its first column and adds the matching gene name to each row as a new first column, leaving the other columns unchanged. The script sets the biomaRt host to grch37.ensembl.org, so it can be especially useful when the default biomaRt site is down.
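The lookup itself can be tried without R. The sketch below is only an illustration: it uses a hand-written two-column mapping in place of the table downloaded through biomaRt, and the function name add_gene_names and the temporary file names are hypothetical, not part of the repository.

```shell
#!/bin/bash
# Hypothetical helper: prepend the gene name found in a two-column
# mapping file (ID <tab> name) to every row of a data file.
add_gene_names() {
    local map_file="$1" data_file="$2"
    # While reading the first file, fill the ID -> name map; for every
    # row of the second file, print the name followed by the whole row.
    awk 'NR==FNR { name[$1] = $2; next } { print name[$1], $0 }' "$map_file" "$data_file"
}

# Toy inputs standing in for the biomaRt table and the user's file.
tmpdir=$(mktemp -d)
printf 'ENSG00000210077\tMT-TV\nENSG00000198888\tMT-ND1\n' > "$tmpdir/map.txt"
printf 'ENSG00000198888\tchrM\t3307\nENSG00000210077\tchrM\t1602\n' > "$tmpdir/data.txt"

add_gene_names "$tmpdir/map.txt" "$tmpdir/data.txt"
rm -r "$tmpdir"
```

The gene name is separated from the row by a single space, while the original tab-separated columns are passed through untouched.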

Dependencies

Rscript, biomaRt
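Both dependencies can be checked before running the script. This is a minimal sketch, not something the repository provides; the helper names have_cmd and have_biomart are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: succeed if Rscript exists and the biomaRt
# package is installed in its library.
have_biomart() {
    have_cmd Rscript &&
        Rscript -e 'if (!requireNamespace("biomaRt", quietly = TRUE)) quit(status = 1)' >/dev/null 2>&1
}

if have_biomart; then
    echo "Rscript and biomaRt found"
else
    echo "Rscript or the biomaRt package is missing" >&2
fi
```

Note that the generated script.r is started through its #!/usr/bin/Rscript line, so Rscript also has to exist at that exact path, not just somewhere on PATH.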

Usage

chmod 775 ensembl2genename.sh
./ensembl2genename.sh file.txt

Example

head file.txt

ENSG00000210077   chrM   1602   1670   69     0
ENSG00000210082   chrM   1671   3229   1559   106043
ENSG00000209082   chrM   3230   3304   75     1
ENSG00000198888   chrM   3307   4262   956    29426
ENSG00000210100   chrM   4263   4331   69     0
ENSG00000210107   chrM   4329   4400   72     0
ENSG00000210112   chrM   4402   4469   68     0
ENSG00000198763   chrM   4470   5511   1042   15914
ENSG00000210117   chrM   5512   5579   68     0
ENSG00000210127   chrM   5587   5655   69     0

The file also has a strand column after the fourth column. Only nine of its ten values survive here (+ + + + + + + + -), so it is not aligned to the rows above.

chmod 775 ensembl2genename.sh
./ensembl2genename.sh file.txt

head file.txt.genename

MT-TV     ENSG00000210077   chrM   1602   1670   69     0
MT-RNR2   ENSG00000210082   chrM   1671   3229   1559   106043
MT-TL1    ENSG00000209082   chrM   3230   3304   75     1
MT-ND1    ENSG00000198888   chrM   3307   4262   956    29426
MT-TI     ENSG00000210100   chrM   4263   4331   69     0
MT-TQ     ENSG00000210107   chrM   4329   4400   72     0
MT-TM     ENSG00000210112   chrM   4402   4469   68     0
MT-ND2    ENSG00000198763   chrM   4470   5511   1042   15914
MT-TW     ENSG00000210117   chrM   5512   5579   68     0
MT-TA     ENSG00000210127   chrM   5587   5655   69     0

As in the input, the strand column (+ + + + + + + + -) is missing one value and is left out of the rows.

© 2016 GitHub, Inc.


ensembl2genename / ensembl2genename.sh

Last commit f2c5a0b on May 13 by milospjanic: Rename ensemble2genename.sh to ensembl2genename.sh

#!/bin/bash

# Write the R script (needs biomaRt)

echo "#!/usr/bin/Rscript
library(biomaRt)
listMarts(host=\"grch37.ensembl.org\")
ensembl = useMart(\"ENSEMBL_MART_ENSEMBL\", dataset=\"hsapiens_gene_ensembl\", host=\"grch37.ensembl.org\")
id_merge = getBM(attributes=c(\"ensembl_gene_id\",\"external_gene_name\"), mart=ensembl)
write.table(id_merge, file=\"id_merge.txt\", sep=\"\t\", quote=F, col.names=F, row.names=F)
" > script.r

# Run the R script; it writes the ID/name table to id_merge.txt

chmod 775 script.r
./script.r

# Use awk to prepend gene names: the first file fills the ID -> name map,
# then every row of the input file is printed with its gene name in front

awk 'NR==FNR {h2[$1] = $2; next} {print h2[$1], $0}' id_merge.txt "$1" > "$1.genename"

# Remove temporary files

rm id_merge.txt
rm script.r
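One consequence of the awk step above: an ID that is missing from the biomaRt table gets an empty gene name, so its output line begins with a space. A minimal sketch for spotting such rows follows; the function name list_unnamed_rows, the sample file and the ID ENSG_UNKNOWN are hypothetical and only serve as an illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the rows of a .genename file whose gene
# name is empty, i.e. lines starting with the separator space.
list_unnamed_rows() {
    grep '^ ' "$1"
}

# Toy output file: the second row had no match in the mapping.
sample=$(mktemp)
printf 'MT-TV ENSG00000210077\tchrM\n ENSG_UNKNOWN\tchrM\n' > "$sample"
list_unnamed_rows "$sample"
rm "$sample"
```

On a real run the call would be list_unnamed_rows file.txt.genename; grep exits with a non-zero status when every row received a name.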


ensembl2genename / ensembl2GeneNameMod2.sh

Last commit 78039d0 on May 13 by milospjanic: Rename ensemble2GeneNameMod2.sh to ensembl2GeneNameMod2.sh

#!/bin/bash

# Write the R script (needs biomaRt)

echo "#!/usr/bin/Rscript
library(biomaRt)
listMarts(host=\"grch37.ensembl.org\")
ensembl = useMart(\"ENSEMBL_MART_ENSEMBL\", dataset=\"hsapiens_gene_ensembl\", host=\"grch37.ensembl.org\")
id_merge = getBM(attributes=c(\"ensembl_gene_id\",\"external_gene_name\"), mart=ensembl)
write.table(id_merge, file=\"id_merge.txt\", sep=\"\t\", quote=F, col.names=F, row.names=F)
" > script.r

# Run the R script; it writes the ID/name table to id_merge.txt

chmod 775 script.r
./script.r

# Use awk to prepend gene names, printing only rows whose ID is in the map

awk 'NR==FNR {h2[$1] = $2; next} {if($1 in h2) print h2[$1], $0}' id_merge.txt "$1" > "$1.genename"

# Remove temporary files

rm id_merge.txt
rm script.r
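The if($1 in h2) guard in the awk step means that rows whose ID has no entry in the mapping are left out of the output instead of being printed with an empty name. A small sketch of that effect on toy data; the function name annotate_matched_only and the ID ENSG_UNKNOWN are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prepend gene names, keeping only rows whose
# first-column ID appears in the mapping file.
annotate_matched_only() {
    awk 'NR==FNR { name[$1] = $2; next } ($1 in name) { print name[$1], $0 }' "$1" "$2"
}

# Toy inputs: one ID is in the mapping, one is not.
tmpdir=$(mktemp -d)
printf 'ENSG00000210077\tMT-TV\n' > "$tmpdir/map.txt"
printf 'ENSG00000210077\tchrM\nENSG_UNKNOWN\tchrM\n' > "$tmpdir/data.txt"

# Prints a single line for MT-TV; the ENSG_UNKNOWN row is dropped.
annotate_matched_only "$tmpdir/map.txt" "$tmpdir/data.txt"
rm -r "$tmpdir"
```

Because rows can be dropped, the output may have fewer lines than the input; comparing wc -l on both files shows how many IDs were not found.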
