Light strand 5'TAGACGTCATTCAGGTGCCAAACGACTTAACGGGATTAC. He av y st ra nd. 5'. GT. AA. TC. CC. GT. TA. AG. TC. GT. TT. GG. CA. CC. TG. AA. TG.
Figure S1: Summary of the emulsion PCR (emPCR) process Light strand 5‘TAGACGTCATTCAGGTGCCAAACGACTTAACGGGATTAC ATCTGCAGTAAGTCCACGGTTTGCTGAATTGCCCTAATG‘5 dnarts yvaeH 1a) Original template molecules are fragmented and denatured TAGACGTCATT
AACGGGATTAC
CAGGTGCCAAACGACTT
ATCTGCAGTAA
TTGCCCTAATG
GTCCACGGTTTGCTGAA
GTCCACGGTTTGCTGAA
A
1b) Subsequent ligation and purification steps leave original, single-stranded template molecules bound to ligator sequences ‘A’ and ‘B’
B
1c) These single-stranded molecules are captured by oligonucleotide probes complementary to ligator ‘A’ that are bound to the emPCR beads.
GTCCACGGTTTGCTGAA CAGGTGCCAAACGACTT
B
1d) During the first round of emPCR, a complementary molecule to the captured original template molecule is synthesised from the probe
B
AAC
GAC
TT T
B
B CT
CCA
GA
GTG
AC
CAG
B
GA
A
A
B
GTCCACGGTTTGCTGAA CAGGTGCCAAACGACTT
GTC CAGGCACGGT TGCC TTGC AAAC TGAA GACT T
G CA TC GG CA TG CG CC GT AA TT AC GC GA TG CT AA T
B
B
B
CC
A CA
AA
B
AC
A
CAGGTGCCAAACGACTT
B
A
A
T CT
A
G C TC A GGCAC TG GG CC TT A T A ACGCT GA GA CT A T
AA
GCA
TTG
GT GTC
G CCT
A
A
1e) Decendent molecules from the emPCR process are bound to the emPCR bead by the AAGTCGTTTGGCACCTG remaining probes
A GA CT T TG CT TT ACGA G G AC AA CC CC GT GGTG CA
A
AA TG TT GC GAC T TT AC GG AA AC GCC C C T GT AGG C
B
GT CA CCA GG CG TG GT CC TT AA GC AC TG GA AA CT T
B
A
GTCCACGGTTTGCTGAA
A
A
A
CA
G CA
GG
GT
GC
TG
B
G
CA
A A
G
AC
AA
CC
G GT
T
T AC
B
CAGGTGCCAAACGACTT
CAGGTGCCAAACGACTT
A
A CAGG
TGCC
A
A
AAAC
CAG
GTG
A
GACT
CA GG T
CA
GG
TG
CCA
AAC
GAC
TT B
TT
AC
AA CG
CA
GC
CC
AA
AC
GA
CT
T
B
T
B
1f) The captured DNA fragments dissasociate during further processing steps. Only the captured-single stranded molecules, which are the complement of the original single-stranded template molecule remain. These form the template for subsequent pyrosequencing (See Fig. S2)
B
Fig. S2: Pyrosequencing of captured emPCR products and subsequent identification of strand of origin of the original template molecule 2a) Following emPCR captured single-stranded DNA undergoes pyrosequencing using primers that are complementary to the 3’ ligator sequence 5‘
CAGGTGCCAAACGACTT
5‘
5‘
AAGTCGTTTGGCACCTG 5‘
2b) Generated sequence is the reverse complement of the captured single-stranded DNA molecule
2c) When aligning the sequence to a reference sequence (here a mitochondrial L strand): 2c(i) Sequence generated from a captured L (red) strand must be reverse complemented for alignment 2c(ii) Sequences generated from a captured H (blue) strand can be directly aligned to the reference sequence 2d) Therefore: During data analysis the original orientation of the template molecule can be unambigously determined by assessing whether the sequence can be directly aligned or not 2e) Furthermore, as the captured single-stranded DNA molecules are the complement of original single-stranded molecules (see Fig. 1): 2e(i) Sequence generated from a captured L (red) strand ultimately originate from an original single-stranded L strand molecule 2e(ii) Sequence generated from a captued H (blue) strand ultimately originate from an original Single-stranded H strand molecule 2f) This statement can be generalised to state: DNA sequences generated through sequencing-by-sythesis directly describe the original single-stranded template molecule
Figure S3: Identification of DNA miscoding lesions in sequence data 3a) Captured single-stranded DNA molecule post emPCR process (as Fig. S2a)
5‘
3b) Captured single-stranded DNA molecule containing C T miscoding lesion post emPCR process. The captured molecule is the complement of an original single-stranded molecule.Therefore the observed miscoding lesion derives from an original G A damage event. 5‘
CAGGTGCCAAACGACTT
CAGGTGCT TAAACGACTT
3c) Captured molecules undergo pyrosequencing (see Fig. S2). A G A change is observed when the sequence derived from the damaged molecule (Fig. S3b) is compared to the sequence derived from the undamaged (Fig. S3a) molecule. 5‘
CAGGTGCCAAACGACTT GTCCACGGTTTGCTGAA‘5
5‘ 5‘
CAGGTGCT TAAACGACTT GTCCACGATTTGCTGAA
5‘
3d) As the generated sequence directly describes the original single-stranded template molecule (Fig. S2f), the observed G A miscoding lesion accurately describes the original G A damage event.