Figure S12. Validation of 3'UTRs using RNA-Seq data

[Figure S12 panels; image not reproduced]

A. Y-axis: Coverage_Upstream - Coverage_Downstream. X-axis: bins of PC.
   Bin1: PC 4-50; Bin2: 50-100; Bin3: 100-200; Bin4: 200-300; Bin5: 300-500; Bin6: > 500
   *PC: Peak coverage

B.                    Raw Numbers   Percentage
   Total candidates   432           100
   Outliers           111           25.6
   With Evidence      84            19.4
   Remaining          27            6.25

C. Processing Ion Torrent Reads
   Raw reads: 2353375 (2 M)
   Filter 1: Trimming poly A's & poly T's
     *Reads with poly A's are trimmed off.
     *Reads with poly T's are trimmed & reverse complemented.
     Reads with A at the end: 366259 (16%). Reads with T's at the end: 295836 (12.6%).
   Filter 2: Length filter > 30 nt
   Processed reads: 662095 reads (used only poly A/T reads; considered as 100%)
   Reads are aligned to the genome using Bowtie2 & Blast.
   Number of reads aligned (panel labels Bowtie2 and Blast): 377909, 21341, 8174
   407424 reads aligned to genome (62%).

D. X-axis: Read Length. Y-axis: Percentage of Reads. Legend (variable): BeforeAlignment, AfterAlignment.

E. Y-axis: Coverage_upstream - Coverage_downstream. Groups: RNA-Seq, Ion-torrent.

Figure S12. Validation of 3’UTRs using RNA-Seq data. A) RNA-Seq coverage of the upstream region (from transcript end to polyadenylation site start) and the downstream region (after the polyadenylation site) across bins of polyadenylation site coverage (PC). The difference between upstream and downstream RNA-Seq coverage increases with PC. B) Table showing the raw number and percentage of transcripts used to validate 3P-Seq with RNA-Seq data. C) Schematic of the pre-processing steps for reads obtained from the Ion Torrent platform. D) Binned frequency of Ion Torrent read lengths. More than 80% of reads are longer than 100 nt. E) Violin plot of the difference in coverage between the upstream and downstream regions, derived from RNA-Seq and Ion Torrent reads.
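The read pre-processing summarised in panel C (poly-A/T trimming, reverse complementing poly-T reads, then a length filter) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the minimum homopolymer run (`min_run=5`), and the choice to look for poly-A at the 3' end and poly-T at the 5' end are assumptions made for illustration; only the > 30 nt length cutoff comes from the figure.

```python
import re

# Complement table for reverse complementing (N is kept as N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def preprocess_read(seq, min_run=5, min_len=30):
    """Trim a poly-A tail or a leading poly-T run from one read.

    Returns the trimmed read (poly-T reads are reverse complemented),
    or None if the read has no poly-A/T run or is not longer than
    min_len after trimming. min_run is a hypothetical threshold.
    """
    seq = seq.upper()
    tail_a = re.search(r"A{%d,}$" % min_run, seq)  # poly-A at the 3' end
    head_t = re.match(r"T{%d,}" % min_run, seq)    # poly-T at the 5' end (assumed)
    if tail_a:
        trimmed = seq[:tail_a.start()]
    elif head_t:
        trimmed = reverse_complement(seq[head_t.end():])
    else:
        return None  # only poly-A/T reads are kept
    # Filter 2: keep reads longer than min_len nucleotides
    return trimmed if len(trimmed) > min_len else None
```

Applied to each read in turn, the reads for which `preprocess_read` returns a sequence would correspond to the "processed reads" passed on to alignment in the schematic.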