Overview of the 454 Sequencing System
1) Prepare adapter-ligated ssDNA library (A-[insert]-B)
2) emPCR: clonal amplification on 28 µm beads, followed by enrichment
3) Load beads and enzymes in PicoTiterPlate™
4) Perform sequencing-by-synthesis on the 454 Sequencer
CSB2008, August 2008, UCSC Sequencing Center
Emulsion-Based Clonal Amplification
[Figure: adapter (A/B) carrying library DNA + PCR reagents + emulsion oil, forming micro-reactors]
1) Mix DNA library & capture beads (limited dilution)
2) Create "water-in-oil" emulsion
3) Perform emulsion PCR
4) "Break micro-reactors", isolate DNA-containing beads
• Generation of millions of clonally amplified sequencing templates on each bead
• No cloning and colony picking
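The limited-dilution step above aims for at most one library fragment per bead-containing micro-reactor. The sketch below illustrates why the dilution matters. It assumes, which the slide does not state, that the number of fragments per micro-reactor follows a Poisson distribution; the function names and the example ratios are illustrative only.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k fragments in a micro-reactor (Poisson model, an assumption)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def loading_fractions(ratio: float) -> dict:
    """Fractions of bead-containing micro-reactors that are empty, clonal, or mixed."""
    empty = poisson_pmf(0, ratio)    # no fragment: bead stays DNA-negative
    clonal = poisson_pmf(1, ratio)   # exactly one fragment: one fragment, one bead
    mixed = 1.0 - empty - clonal     # two or more fragments: non-clonal bead
    return {"empty": empty, "clonal": clonal, "mixed": mixed}

for ratio in (0.1, 0.5, 2.0):  # hypothetical fragment-to-bead ratios
    f = loading_fractions(ratio)
    print(f"ratio={ratio}: empty={f['empty']:.3f} "
          f"clonal={f['clonal']:.3f} mixed={f['mixed']:.3f}")
```

Under this model a low ratio keeps mixed beads rare but leaves most beads empty, which is consistent with the workflow following emPCR with an enrichment step for DNA-containing beads.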
Depositing DNA Beads into the PicoTiter™Plate
1) Load beads into PicoTiter™Plate
2) Centrifuge step
3) Load enzyme beads
[Figure: PicoTiterPlate wells, 44 µm]
Sequencing By Synthesis
• Simultaneous sequencing of the entire genome in hundreds of thousands of picoliter-size wells
• Pyrophosphate signal generation
[Figure: DNA capture bead containing millions of copies of a single clonal fragment, with annealed primer; base letters shown: A A T C G G C A T G C T A A A A G T C A T. Sulfurylase: APS + PPi to ATP. Luciferase: ATP + luciferin to light + oxyluciferin.]
Sequencing Workflow
Overview
• Sample input: genomic DNA, BACs, amplicons, cDNA
• Generation of small DNA fragments via nebulization
• Ligation of A/B adaptors flanking single-stranded DNA fragments
• Emulsification of beads and fragments in water-in-oil microreactors
• Clonal amplification of fragments bound to beads in microreactors
• Sequencing and base calling
One Fragment, One Bead, One Read
400,000 reads per run
Sequencing Workflow
Library Preparation
• Genome fragmented by nebulization
• sstDNA library created with adaptors A/B
• A/B fragments selected using streptavidin-biotin purification
Timeline: DNA library preparation and titration (4.5 h and 10.5 h), emPCR (8.0 h), Sequencing (7.5 h)
Sequencing Workflow
Emulsion PCR
Emulsion-based clonal amplification
1) Anneal sstDNA to an excess of DNA Capture Beads
2) Emulsify beads and PCR reagents in water-in-oil microreactors
3) Clonal amplification occurs inside microreactors
4) Break microreactors, enrich for DNA-positive beads
Sequencing Workflow
Loading of PicoTiterPlate Device
Depositing DNA beads into the PicoTiterPlate device
• Well diameter: average of 44 µm
• > 400,000 reads obtained in parallel
• A single clonally amplified sstDNA bead is deposited per well
[Figure: amplified sstDNA library beads]
Sequencing Workflow
Sequencing by Synthesis
• Bases (TACG) are flowed sequentially and always in the same order (100 times for a large GS FLX run) across the PicoTiterPlate device during a sequencing run.
• A nucleotide complementary to the template strand generates a light signal.
• The light signal is recorded by the CCD camera.
• The signal strength is proportional to the number of nucleotides incorporated.
[Figure: flowgram with key sequence; quality filtered bases]
GS FLX Data Analysis
Flowgram Generation
[Figure: flowgram for the sequence TTCTGCGAA; flow order T A C G; signal levels 1-mer, 2-mer, 3-mer, 4-mer]
Key sequence = TCAG for signal calibration
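The flowgram idea above can be sketched in a few lines. The example assumes a noise-free signal that equals the homopolymer length at each flow, and it treats the input as the newly synthesized strand (the read); the function names `ideal_flowgram` and `call_bases` are hypothetical and are not part of the GS FLX software.

```python
def ideal_flowgram(read: str, flow_order: str = "TACG") -> list:
    """Return noise-free per-flow signals (homopolymer lengths) for a read."""
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    signals = []
    pos = 0
    while pos < len(read):
        for base in flow_order:
            run = 0
            # Count consecutive bases that match the nucleotide being flowed.
            while pos < len(read) and read[pos] == base:
                run += 1
                pos += 1
            signals.append(run)
            if pos == len(read):
                break
    return signals

def call_bases(signals, flow_order: str = "TACG") -> str:
    """Invert a flowgram by rounding each signal to the nearest homopolymer length."""
    bases = []
    for i, signal in enumerate(signals):
        bases.append(flow_order[i % len(flow_order)] * int(round(signal)))
    return "".join(bases)

print(ideal_flowgram("TTCTGCGAA"))              # slide example sequence
print(call_bases(ideal_flowgram("TTCTGCGAA")))  # round trip back to the read
```

Each entry in the returned list corresponds to one nucleotide flow, so a value of 2 at a T flow represents the "2-mer" signal level for TT, and a 0 means the flowed base was not incorporated.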
GS FLX Data Analysis
Overview
• Image capture
• Image processing
• Signal processing
• GS Run Browser
• GS De Novo Assembler
• GS Reference Mapper
• GS Amplicon Variant Analyzer
GS FLX System Performance
Read Length
The Genome Is Composed of Repeat Regions
• Depending upon the specific genome characteristics, microreads (~25 bp) cover only a portion of the genome
  – In human, 25-base-pair reads can only be mapped uniquely to 80% of the genome
• Short reads are limiting in known genomes. What about unknown genomes?
  – Mapping versus de novo assemblies
  – Mapping will miss genome rearrangements
  – Mapping is only as good as the reference
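The effect of repeats on unique mapping can be illustrated on a toy sequence. This is only a sketch: it counts exact k-mer matches on one strand, whereas real read mapping also considers the reverse strand and mismatches, and the "genome" below is a made-up 60-base string with a 12-base repeat.

```python
from collections import Counter

def unique_fraction(genome: str, k: int) -> float:
    """Fraction of start positions whose k-mer occurs exactly once in the genome."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    return sum(1 for kmer in kmers if counts[kmer] == 1) / len(kmers)

# Toy "genome": flanking sequence around two copies of a 12-base repeat (invented).
repeat = "ACGTACGGTCCA"
genome = "TTGACCTGAAGC" + repeat + "GGATCTTAGCAA" + repeat + "CATGGTTCAGAT"

for k in (4, 8, 12, 16):
    print(f"k={k}: {unique_fraction(genome, k):.2f} of positions map uniquely")
```

Reads no longer than the repeat cannot be placed uniquely when they fall entirely inside it, which is the limitation the slide attributes to ~25 bp microreads.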
Why Does Length Matter?
• Longer sequencing reads mean more applications
  – Identify and characterize small and short RNAs
  – Full-length cDNA sequencing for expression levels and variations
  – Amplicon resequencing for genetic variation including somatic mutations
  – Sequencing of micro-organisms in a single instrument run
  – Sequencing of complex genomes: mammalian & plant
  – Sequencing of complex samples: metagenomics, ancient DNA
Longer sequencing reads mean more applications
– HIV Studies (3)
– ChIP-Sequencing (8)
  • Boyle et al., Cell: Mixed technologies for mapping open chromatin
– Metagenomics (12)
  • Palacios et al., New England Journal of Medicine: Pathogenic Virus Detection
– Whole Genome Sequencing (30)
  • Velasco et al., PLoS: Pinot Noir Genome
– Paired-End sequencing
  • Detecting Structural Variations across two human genomes
– Technology and Bioinformatics (11)
  • Meyer et al., NAR: Using Picogram quantities of sample
– Transcriptome studies: cDNA (17)
– Small RNA (32)
– Amplicon and Methylation Studies (9)
Applications of Whole Genome, Ultra Broad and Ultra Deep HT-Sequencing
HT-Sequencing Technology Applications
• Whole Genome Sequencing: Virus, Bacteria, Fungus, Higher Eukaryotes, Human
• Ultra Broad Sequencing: Small RNA, Expression, Transcriptome, Metagenomics, Novel strain ID, HLA Typing
• Ultra Deep Sequencing: HIV Resistance Tropism, Amplicons, Population Biology, Bacterial 16S
The Power of Metagenomics
• How to identify an environment based upon the microbial organisms that are present
  – Microbial Population Structures in the Deep Marine Biosphere
    • Huber et al., Science, 318, p97, 2007
• Determining the state of an environment based upon the presence and mixture of microbial organisms
  – The interdependence of coral and its microbial environment
    • Wegley et al., Environmental Microbiology, 9, p2707, 2007
• Detecting viral pathogens quickly and accurately
  – Less than 12 months from first identification of affected hives to possible pathogen
    • Cox-Foster et al., Science, 2007
  – Transplant victims from Australia
    • Palacios et al., New England Journal of Medicine, 2008
Transcriptome Analysis
Workflow Comparison
• GS FLX (clonal sequencing ensured through emPCR). Time: Days
  1) cDNA libraries (short tag library, EST library)
  2) emPCR
  3) Sequencing
• Sanger (E. coli cloning, often concatemerization). Time: Weeks
  1) cDNA libraries (short tag library, EST library)
  2) Concatemerization, insert fragments into vectors and clone into bacteria
  3) Grow, pick colonies
  4) Template generation
  5) Sequencing
• Sequencing of approximately 400,000 small RNAs from C. elegans
• Another 18 unknown miRNA genes were detected
• Thousands of endogenous siRNAs acting preferentially on transcripts associated with spermatogenesis and transposons were identified
• A new class of small RNAs was identified: 21U-RNAs. They all begin with a U and are precisely 21 nt long.
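The two properties named for 21U-RNAs (a leading U and a length of exactly 21 nt) translate into a simple read filter. The sketch below is illustrative only: the function names and example sequences are invented, a leading T is accepted because sequencing reads are usually reported in DNA letters, and passing the filter makes a read a candidate rather than a confirmed 21U-RNA.

```python
def is_21u_candidate(read: str) -> bool:
    """True if the read is exactly 21 nt long and starts with U (or T in DNA letters)."""
    read = read.upper()
    return len(read) == 21 and read[0] in ("U", "T")

def select_21u(reads):
    """Keep only the reads that satisfy both 21U criteria."""
    return [r for r in reads if is_21u_candidate(r)]

# Made-up example reads, not real C. elegans sequences.
reads = [
    "TGGAACGTCTAGCATCGATCA",   # 21 nt, starts with T: candidate
    "AGGAACGTCTAGCATCGATCA",   # 21 nt, starts with A: rejected
    "TGGAACGTCTAGCATCGATCAG",  # 22 nt: rejected
]
print(select_21u(reads))
```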
Multiplex Identifier (MID) Basics
• What is it?
  – Two new kits, each with 6 different library adapters (total of 12 adapters)
  – Each MID library adapter has an added, specially encoded 10-base region
  – Used to "bar-code" up to 12 different genomic library samples to be run in the same region of a single sequencing run
Standard Library: Primer A (40 bases) | Key (4 bases) | Library fragment | Primer B
MID Library: Primer A (15 bases) | Key (4 bases) | MID 1 (10 bases) | Library fragment | Primer B
MID 2 through MID n: Primer A | Key | MID | Library fragment | Primer B
[Figure labels: Seq. primer, Read]
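Sorting pooled reads back into samples by their MID can be sketched as below. The layout assumed here (4-base key, then the 10-base MID, then the library fragment) follows the MID Library diagram; the MID sequences, sample names, and exact-match rule are hypothetical placeholders, and a production tool may well tolerate sequencing errors in the bar-code.

```python
KEY = "TCAG"      # key sequence named on the flowgram slide
MID_LENGTH = 10   # "specially encoded 10-base region"

# Hypothetical MID sequences for illustration only; real kits define their own.
MIDS = {
    "sample_1": "AACCGGTTAC",
    "sample_2": "GGTTAACCGT",
}

def demultiplex(reads, mids=MIDS, key=KEY):
    """Group reads by exact MID match; return (by_sample, unassigned)."""
    lookup = {seq: name for name, seq in mids.items()}
    by_sample = {name: [] for name in mids}
    unassigned = []
    start, end = len(key), len(key) + MID_LENGTH
    for read in reads:
        name = lookup.get(read[start:end]) if read.startswith(key) else None
        if name is None:
            unassigned.append(read)              # missing key or unknown MID
        else:
            by_sample[name].append(read[end:])   # keep only the library fragment
    return by_sample, unassigned

reads = [
    "TCAG" + "AACCGGTTAC" + "GATTACA",
    "TCAG" + "GGTTAACCGT" + "CCCGGG",
    "TCAG" + "TTTTTTTTTT" + "AAAA",   # unknown MID
    "GGGG",                           # no key
]
by_sample, unassigned = demultiplex(reads)
print(by_sample, unassigned)
```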
Paired-End Applications
• ~100 bp sequencing tags separated by 3 kb spacing
• Use for de novo assembly
  – Order contigs
• Use for Structural Variation studies
  – Inversions, Deletions, Insertions…
  – High resolution detection: 3 kb spacing vs 10 to 40 kb
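One way paired-end tags can flag structural variation is by comparing the distance between the two mapped tags with the expected ~3 kb spacing. The sketch below is a simplified illustration: the 500 bp tolerance, the orientation flag, and the function name are assumptions, not values or methods taken from the slides.

```python
EXPECTED_SPACING = 3000   # "~3 kb spacing" from the slide
TOLERANCE = 500           # hypothetical allowed deviation, not from the slide

def classify_pair(start_a, start_b, orientation_ok=True,
                  expected=EXPECTED_SPACING, tolerance=TOLERANCE):
    """Label a tag pair by its mapped distance on the reference."""
    if not orientation_ok:
        return "possible inversion"
    distance = abs(start_b - start_a)
    if distance > expected + tolerance:
        # Tags map farther apart than expected: the sample lacks reference sequence.
        return "possible deletion"
    if distance < expected - tolerance:
        # Tags map closer than expected: the sample carries extra sequence.
        return "possible insertion"
    return "concordant"

print(classify_pair(1000, 4100))   # within tolerance of 3 kb
print(classify_pair(1000, 9000))   # mapped 8 kb apart
```

A shorter spacing narrows the window in which a breakpoint can lie, which is the resolution argument behind the slide's comparison of 3 kb with 10 to 40 kb.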
Paired-End Workflow
Targeted Enrichment of Human gDNA
[Figure: gDNA with Exon 1, Exon 2, Exon 3, Exon 4, Exon 5]
1) Fragment gDNA and hybridize to NimbleGen capture array
2) Elute
3) Analyze exon sequences
HT-Sequencing
Sequencing all the known exons from the human genome
• "Direct selection of human genomic loci by microarray hybridization"
  – Albert et al., Nature Methods, 4(11), 903-905, 2007
• ~6,700 gDNA loci selected
• BRCA1 region: 2 Mb region
Another Sequence-Capture Example
• 19 kb region from Chromosome 4
[Figure tracks: Targeted Exons; Seq-Cap Array Probes; GS FLX Seq Reads; Sequencing Coverage]