Overview of The 454 Sequencing System

1) Prepare adapter-ligated ssDNA library (A-[insert]-B)
2) emPCR: clonal amplification on 28 µm beads, followed by enrichment
3) Load beads and enzymes in PicoTiterPlate™
4) Perform sequencing-by-synthesis on the 454 Sequencer

CSB2008, August 2008
UCSC Sequencing Center


Emulsion-Based Clonal Amplification

1) Mix DNA library and capture beads (limited dilution)
2) Create "water-in-oil" emulsion: beads + PCR reagents + emulsion oil form micro-reactors
3) Perform emulsion PCR
4) "Break micro-reactors" and isolate DNA-containing beads

[Figure labels: A, B; adapter carrying library DNA; micro-reactors]

•  Generation of millions of clonally amplified sequencing templates on each bead
•  No cloning and colony picking


Depositing DNA Beads into the PicoTiterPlate™

•  Load beads into PicoTiterPlate™
•  Load enzyme beads
•  Centrifuge step

[Figure label: 44 µm]


Sequencing by Synthesis

•  Simultaneous sequencing of the entire genome in hundreds of thousands of picoliter-size wells
•  Pyrophosphate signal generation

[Figure: DNA capture bead containing millions of copies of a single clonal fragment, with annealed primer. Bases shown in the figure, in extraction order: A A T C G G C A T G C T A A A A G T C A T. Signal cascade: PPi + APS → ATP (sulfurylase); ATP + luciferin → light + oxyluciferin (luciferase).]


Sequencing Workflow: Overview

•  Sample input: genomic DNA, BACs, amplicons, cDNA
•  Generation of small DNA fragments via nebulization
•  Ligation of A/B adaptors flanking single-stranded DNA fragments
•  Emulsification of beads and fragments in water-in-oil microreactors
•  Clonal amplification of fragments bound to beads in microreactors
•  Sequencing and base calling

One fragment → one bead → one read (400,000 reads per run)

Sequencing Workflow: Library Preparation

•  Genome fragmented by nebulization
•  sstDNA library created with adaptors A/B
•  Fragments selected using streptavidin-biotin purification

Timeline: DNA library preparation and titration (4.5 h and 10.5 h) → emPCR (8.0 h) → Sequencing (7.5 h)


Sequencing Workflow: Emulsion PCR

Emulsion-based clonal amplification

•  Anneal sstDNA to an excess of DNA Capture Beads
•  Emulsify beads and PCR reagents in water-in-oil microreactors
•  Clonal amplification occurs inside microreactors
•  Break microreactors, enrich for DNA-positive beads
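Why an excess of beads helps can be illustrated with a simple loading model. The slides do not state a model; the sketch below assumes that fragments land on beads independently, so the number of fragments per bead is Poisson distributed with mean `lam`. The function names and the example ratios are hypothetical.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k fragments on one bead (assumed Poisson model)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def bead_fractions(lam: float) -> dict:
    """Split beads into empty, clonal (one fragment) and mixed (two or more)."""
    empty = poisson_pmf(0, lam)
    clonal = poisson_pmf(1, lam)
    return {"empty": empty, "clonal": clonal, "mixed": 1.0 - empty - clonal}

if __name__ == "__main__":
    # Illustrative fragment-to-bead ratios only; not values from the slides.
    for lam in (0.1, 0.5, 1.0, 2.0):
        f = bead_fractions(lam)
        print(f"{lam:>4} fragments/bead: empty={f['empty']:.3f} "
              f"clonal={f['clonal']:.3f} mixed={f['mixed']:.3f}")
```

Under this assumption, a low fragment-to-bead ratio keeps mixed beads rare at the cost of many empty beads, which is consistent with the enrichment step for DNA-positive beads that follows emPCR.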


Sequencing Workflow: Loading of PicoTiterPlate Device

Depositing DNA beads into the PicoTiterPlate device

•  Well diameter: average of 44 µm
•  > 400,000 reads obtained in parallel
•  A single clonally amplified sstDNA bead is deposited per well

[Figure labels: amplified sstDNA library beads; quality filtered bases]

Sequencing Workflow: Sequencing by Synthesis

•  Bases (TACG) are flowed sequentially and always in the same order (100 times for a large GS FLX run) across the PicoTiterPlate device during a sequencing run.
•  A nucleotide complementary to the template strand generates a light signal.
•  The light signal is recorded by the CCD camera.
•  The signal strength is proportional to the number of nucleotides incorporated.

[Figure labels: key sequence; flowgram]
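The flow scheme described above can be sketched as a small simulation. This is an idealized, noise-free model and not the instrument's software: `simulate_flowgram` is a hypothetical helper that takes the synthesized strand (the read) and returns, for each flow, how many nucleotides would be incorporated. Only the TACG flow order comes from the slides.

```python
FLOW_ORDER = "TACG"  # flow order stated on the slides

def simulate_flowgram(read: str, n_cycles: int, flow_order: str = FLOW_ORDER) -> list:
    """Return the ideal signal per flow for a read (the strand being synthesized).

    Each flow extends the strand through the whole run of identical bases,
    so the signal equals the homopolymer length (0 if the base does not match).
    """
    signals = []
    pos = 0  # next base of the read that still has to be incorporated
    for _ in range(n_cycles):
        for base in flow_order:
            run = 0
            while pos < len(read) and read[pos] == base:
                run += 1
                pos += 1
            signals.append(run)
    return signals

if __name__ == "__main__":
    # Four T-A-C-G cycles over the example sequence from the flowgram slide.
    print(simulate_flowgram("TTCTGCGAA", 4))
    # [2, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 2, 0, 0]
```

A flow with value 2 corresponds to a 2-mer such as the leading TT; a value of 0 means the flowed base did not match the next position.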


GS FLX Data Analysis: Flowgram Generation

•  Flow order: T A C G
•  Example sequence: TTCTGCGAA
•  Key sequence = TCAG for signal calibration

[Figure: flowgram with signal levels marked 1-mer, 2-mer, 3-mer and 4-mer]
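How a key sequence can calibrate signal strength is sketched below. This is an assumed, simplified scheme and not the GS FLX base caller: the single-base signal is estimated from the first three key bases (the last key base is skipped because its flow may also contain the first base of the read), and every flow is then rounded to the nearest whole number of bases. The helper names and the raw signal values are hypothetical.

```python
FLOW_ORDER = "TACG"  # flow order from the slides
KEY = "TCAG"         # key sequence from the slides

def key_flow_indices(key: str = KEY, flow_order: str = FLOW_ORDER) -> list:
    """Flow index at which each key base is incorporated (key has no repeated neighbours)."""
    indices, flow = [], 0
    for base in key:
        while flow_order[flow % len(flow_order)] != base:
            flow += 1
        indices.append(flow)
        flow += 1
    return indices

def call_bases(signals: list, key: str = KEY, flow_order: str = FLOW_ORDER) -> str:
    """Turn raw flow signals into a read, using the key to set the 1-mer signal level."""
    key_idx = key_flow_indices(key, flow_order)[:-1]  # skip the last key base
    unit = sum(signals[i] for i in key_idx) / len(key_idx)
    called = []
    for i, signal in enumerate(signals):
        n_bases = int(round(signal / unit))  # signal is proportional to bases incorporated
        called.append(flow_order[i % len(flow_order)] * n_bases)
    sequence = "".join(called)
    if not sequence.startswith(key):
        raise ValueError("read does not start with the expected key")
    return sequence[len(key):]  # report the read without the key

if __name__ == "__main__":
    # Made-up raw intensities for key TCAG followed by TTCTGCGAA.
    raw = [102, 4, 98, 3, 5, 100, 2, 97, 205, 6, 99,
           1, 103, 4, 2, 96, 3, 5, 101, 98, 2, 197]
    print(call_bases(raw))  # TTCTGCGAA
```

Rounding works well for short homopolymers; for long runs the gap between, say, a 7-mer and an 8-mer signal is small relative to noise, which this sketch does not model.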


GS FLX Data Analysis: Overview

•  Image capture
•  Image processing
•  Signal processing
•  GS Run Browser
•  GS De Novo Assembler
•  GS Reference Mapper
•  GS Amplicon Variant Analyzer


GS FLX System Performance: Read Length


The Genome Is Comprised of Repeat Regions

•  Depending upon the specific genome characteristics, microreads (~25 bp) cover only a portion of the genome
 –  In human, 25 base pair reads can only be mapped uniquely to 80% of the genome
•  Short reads are limiting in known genomes
 –  What about unknown genomes?
 –  Mapping versus de novo assemblies
 –  Mapping will miss genome rearrangements
 –  Mapping is only as good as the reference
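The link between read length and unique mapping can be shown on a toy sequence. The sketch below counts how many start positions carry a k-mer that occurs exactly once. It is a deliberate simplification (forward strand only, exact matches, a made-up 27-base "genome" containing one 6-base repeat) and says nothing about the 80% figure for human.

```python
from collections import Counter

def unique_kmer_fraction(genome: str, k: int) -> float:
    """Fraction of start positions whose k-mer occurs exactly once in the genome."""
    n_positions = len(genome) - k + 1
    if n_positions <= 0:
        raise ValueError("k is longer than the genome")
    kmers = [genome[i:i + k] for i in range(n_positions)]
    counts = Counter(kmers)
    unique = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique / n_positions

if __name__ == "__main__":
    # Hypothetical toy sequence: ACGTAC appears twice, surrounded by unique flanks.
    toy_genome = "GATTC" + "ACGTAC" + "TTGCA" + "ACGTAC" + "CCGAT"
    for k in (3, 4, 6, 8):
        print(k, round(unique_kmer_fraction(toy_genome, k), 2))
```

Reads shorter than the repeat cannot be placed uniquely inside it, while reads longer than the repeat reach into the flanking sequence and become unique.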


Why Does Length Matter?

•  Longer sequencing reads mean more applications
 –  Identify and characterize small and short RNAs
 –  Full-length cDNA sequencing for expression levels and variations
 –  Amplicon resequencing for genetic variation, including somatic mutations
 –  Sequencing of micro-organisms in a single instrument run
 –  Sequencing of complex genomes: mammalian and plant
 –  Sequencing of complex samples: metagenomics, ancient DNA


Longer Sequencing Reads Mean More Applications

 –  HIV studies (3)
 –  ChIP-sequencing (8)
   •  Boyle et al., Cell: Mixed technologies for mapping open chromatin
 –  Metagenomics (12)
   •  Palacios et al., New England Journal of Medicine: Pathogenic virus detection
 –  Whole genome sequencing (30)
   •  Velasco et al., PLoS: Pinot Noir genome
 –  Paired-end sequencing
   •  Detecting structural variations across two human genomes
 –  Technology and bioinformatics (11)
   •  Meyer et al., NAR: Using picogram quantities of sample
 –  Transcriptome studies, cDNA (17)
 –  Small RNA (32)
 –  Amplicon and methylation studies (9)


Applications of Whole Genome, Ultra Broad and Ultra Deep HT-Sequencing

HT-Sequencing technology applications:

•  Whole Genome Sequencing: virus, bacteria, fungus, higher eukaryotes, human
•  Ultra Broad Sequencing: small RNA, expression, transcriptome, metagenomics, novel strain ID, HLA typing
•  Ultra Deep Sequencing: HIV resistance, tropism, amplicons, population biology, bacterial 16S


The Power of Metagenomics

•  How to identify an environment based upon the microbial organisms that are present
 –  Microbial population structures in the deep marine biosphere
   •  Huber et al., Science, 318, p97, 2007
•  Determining the state of an environment based upon the presence and mixture of microbial organisms
 –  The interdependence of coral and its microbial environment
   •  Wegley et al., Environmental Microbiology, 9, p2707, 2007
•  Detecting viral pathogens quickly and accurately
 –  Less than 12 months from first identification of affected hives to possible pathogen
   •  Cox-Foster et al., Science, 2007
 –  Transplant victims from Australia
   •  Palacios et al., New England Journal of Medicine, 2008


Transcriptome Analysis: Workflow Comparison

GS FLX (clonal sequencing ensured through emPCR). Time: days
1) cDNA libraries (short tag library, EST library)
2) emPCR
3) Sequencing

Sanger (E. coli cloning, often concatemerization). Time: weeks
1) cDNA libraries (short tag library, EST library)
2) Concatemerization, insert fragments into vectors and clone into bacteria
3) Grow, pick colonies
4) Template generation
5) Sequencing

•  Sequencing of approximately 400,000 small RNAs from C. elegans
•  Another 18 unknown miRNA genes were detected
•  Thousands of endogenous siRNAs acting preferentially on transcripts associated with spermatogenesis and transposons were identified
•  A new class of small RNAs was identified: 21U-RNAs. They all begin with a U and are precisely 21 nt long.
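The two defining properties of 21U-RNAs given above (first base U, length exactly 21 nt) translate into a very small read filter. This is only a sketch of those two criteria: `is_21u_candidate` is a hypothetical helper, and the published analysis may have applied further criteria that are not on the slide. Reads are assumed to arrive as DNA letters, so T is treated as U.

```python
def is_21u_candidate(read: str) -> bool:
    """True if the read is exactly 21 nt long and starts with U (T in DNA reads)."""
    rna = read.upper().replace("T", "U")
    return len(rna) == 21 and rna.startswith("U")

def select_21u_candidates(reads: list) -> list:
    """Keep only the reads that satisfy both criteria, preserving order."""
    return [read for read in reads if is_21u_candidate(read)]

if __name__ == "__main__":
    # Made-up reads for illustration.
    reads = ["T" + "ACGT" * 5, "A" * 21, "TACGTTAGC", "u" + "a" * 20]
    print(select_21u_candidates(reads))
```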


Multiplex Identifier (MID) Basics

•  What is it?
 –  Two new kits, each with 6 different library adapters (total of 12 adapters)
 –  Each MID library adapter has an added, specially encoded 10-base region
 –  Used to "bar-code" up to 12 different genomic library samples to be run in the same region of a single sequencing run

Adapter layout (from the figure; base counts as labeled there):

Standard library:  Primer A (40 bases) | Key (4 bases) | Library fragment | Primer B
MID library:       Primer A (15 bases) | Key (4 bases) | MID 1 (10 bases) | Library fragment | Primer B
                   Primer A | Key | MID 2 | Library fragment | Primer B
                   Primer A | Key | MID n | Library fragment | Primer B

The figure also marks the sequencing primer ("Seq. primer") and the read ("Read") on each layout.
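Sorting reads by MID can be sketched as follows, assuming each read starts with the 4-base key followed directly by the 10-base MID. The barcode sequences and sample names below are invented for illustration and are not the sequences shipped in the MID kits. Matching is exact, which is a simplifying assumption.

```python
KEY = "TCAG"      # key sequence named on the flowgram slide
MID_LENGTH = 10   # length of the MID region stated on the slide

# Hypothetical barcodes for illustration only; NOT the real MID kit sequences.
EXAMPLE_MIDS = {
    "ACGTACGTAC": "sample_1",
    "TGCATGCATG": "sample_2",
}

def demultiplex(read: str, mids: dict = EXAMPLE_MIDS, key: str = KEY) -> tuple:
    """Return (sample, insert) for a read, or (None, read) if key or MID is not recognised."""
    if not read.startswith(key):
        return None, read
    start = len(key)
    barcode = read[start:start + MID_LENGTH]
    sample = mids.get(barcode)
    if sample is None:
        return None, read
    return sample, read[start + MID_LENGTH:]

if __name__ == "__main__":
    print(demultiplex("TCAG" + "ACGTACGTAC" + "GGGTTTAACC"))
    print(demultiplex("TCAG" + "AAAAAAAAAA" + "GGGTTTAACC"))
```

Reads whose barcode is not recognised are returned unassigned rather than guessed, so a sequencing error in the MID costs a read but does not misassign it.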


Paired-End Applications

•  ~100 bp sequencing tags separated by 3 kb spacing
•  Use for de novo assembly
 –  Order contigs
•  Use for structural variation studies
 –  Inversions, deletions, insertions…
 –  High-resolution detection: 3 kb spacing vs 10 to 40 kb
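How the known tag spacing points to structural variation can be sketched with a simple span check. The 3 kb expected spacing comes from the slide; the 500 bp tolerance, the function name and the labels are assumptions for illustration. Tag orientation, which would be needed to flag inversions, is not modeled here.

```python
def classify_pair(span: int, expected: int = 3000, tolerance: int = 500) -> str:
    """Classify one tag pair by the distance between its tags on the reference.

    The tags are about `expected` bases apart in the sequenced sample. If they map
    farther apart on the reference, the sample may lack sequence (deletion);
    if they map closer together, the sample may carry extra sequence (insertion).
    """
    if span > expected + tolerance:
        return "possible deletion"
    if span < expected - tolerance:
        return "possible insertion"
    return "concordant"

if __name__ == "__main__":
    # Made-up mapped spans in base pairs.
    for span in (2900, 3100, 8000, 900):
        print(span, classify_pair(span))
```

A shorter expected spacing narrows the window in which a breakpoint can lie, which is the resolution argument made on the slide for 3 kb versus 10 to 40 kb.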


Paired-End Workflow


Targeted Enrichment of Human gDNA

1) gDNA containing the target exons (Exon 1 to Exon 5 in the figure)
2) Fragment and hybridize to NimbleGen capture array
3) Elute
4) Analyze exon sequences


HT-Sequencing: Sequencing All the Known Exons from the Human Genome

•  "Direct selection of human genomic loci by microarray hybridization"
 –  Albert et al., Nature Methods, 4(11), 903-905, 2007
•  ~6,700 gDNA loci selected
•  BRCA1 region

[Figure label: 2 Mb region]


Another Sequence-Capture Example

•  19 kb region from chromosome 4

[Figure tracks: targeted exons; Seq-Cap array probes; GS FLX seq reads; sequencing coverage]
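A coverage track like the one in the figure can be derived from mapped read intervals. The sketch below is a minimal per-base depth calculation under stated assumptions: reads are given as 0-based, half-open (start, end) intervals relative to the region, and all coordinates in the example are invented.

```python
def coverage(region_length: int, reads: list) -> list:
    """Per-base read depth over a region; reads are (start, end), 0-based, end exclusive."""
    depth = [0] * region_length
    for start, end in reads:
        # Clip each read to the region before counting.
        for position in range(max(start, 0), min(end, region_length)):
            depth[position] += 1
    return depth

def mean_depth_in_targets(depth: list, targets: list) -> float:
    """Average depth over the targeted intervals (for example, exons)."""
    values = [d for start, end in targets for d in depth[start:end]]
    return sum(values) / len(values)

if __name__ == "__main__":
    # Hypothetical 10-base region, two reads, one targeted interval.
    depth = coverage(10, [(0, 5), (3, 8)])
    print(depth)                                    # [1, 1, 1, 2, 2, 1, 1, 1, 0, 0]
    print(mean_depth_in_targets(depth, [(3, 5)]))   # 2.0
```

Comparing mean depth inside and outside the targeted intervals gives a rough measure of how well the capture enriched the exons.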

