Chapter 18 Discrimination of Pseudogene and Parental Gene DNA Methylation Using Allelic Bisulfite Sequencing


Luke B. Hesson and Robyn L. Ward


Abstract


Determining the methylation status of genes with pseudogenes can be technically challenging due to sequence homology. High sequence homology can result in the amplification of both pseudogene and parental gene alleles, potentially leading to data misinterpretation. Allelic bisulfite sequencing allows for detection of the methylation status of individual alleles at nucleotide resolution and represents the most reliable method for discriminating pseudogene and parental gene sequences. Here, we discuss important points that should be considered when investigating pseudogene and parental gene methylation status and we describe the method of allelic bisulfite sequencing, including assay design.

Key words Pseudogene, Epigenetics, Methylation, Bisulfite, Sequencing

1  Introduction


Pseudogenes are ancestral nonfunctional copies of protein coding genes that have lost the potential to give rise to a protein product [1]. Pseudogenes that arise by genomic duplication are called unprocessed pseudogenes, whereas those formed by retrotransposition through reverse transcription of an mRNA intermediate and reintegration into the genome are known as processed pseudogenes. Pseudogenes are not restricted by the same selective pressures as functional parental genes and accumulate deleterious sequence changes over time, usually resulting in stop codons that render the “open reading frame” nonfunctional. Pseudogenes are ubiquitous in the human genome with current estimates indicating that there are over 17,000 pseudogenes [2]. Given the close similarity between pseudogenes and almost all coding genes, it is challenging to develop molecular analyses that are specific for the gene of interest rather than pseudogenes [3].

Most notably, amplification of DNA sequences using PCR can be problematic if the target region is not unique in the genome. Pseudogenes with high sequence homology can therefore be a source of “nonspecific” amplification when investigating gene expression, mutations, or DNA methylation [4, 5]. Both processed and unprocessed pseudogenes often show high sequence homology with the promoter regions of parental genes. Promoter regions are usually the focus of attention when assaying for methylation changes. In a recent study of the methylation status of pseudogene–parental gene pairs, Cortese et al. found that the majority of pseudogenes were methylated in different tissues compared with parental genes [6]. Therefore, when investigating the promoter methylation status of genes with pseudogenes, it is essential that the assays used are able to reliably discriminate pseudogene and parental gene sequences in order to avoid data misinterpretation.

Recently, we have demonstrated the technical challenges associated with analyzing the methylation status of the PTEN CpG island promoter, which shows >95 % sequence homology with the 5′ region of the PTENP1 pseudogene [7]. Using allelic bisulfite sequencing, we were able to unequivocally demonstrate that methylation of the PTEN CpG island is a rare event in cancer cell lines and that apparent methylation in fact originates from homologous regions of the PTENP1 pseudogene [7]. Allelic bisulfite sequencing involves bisulfite PCR, bacterial cloning of PCR amplicons, and fluorescent automated DNA sequencing of individual alleles. Here, we describe a methodological approach to determine the methylation status of genes with pseudogenes or regions sharing high sequence similarity, using allelic bisulfite sequencing.

Laura Poliseno (ed.), Pseudogenes: Functions and Protocols, Methods in Molecular Biology, vol. 1167, DOI 10.1007/978-1-4939-0835-6_18, © Springer Science+Business Media New York 2014

2  Materials

2.1  In Silico Characterization

1. Genome browser [8].
2. Sequence alignment tool such as BLAT [8] or BLAST [9].

2.2  Sodium Bisulfite DNA Modification

1. EZ DNA Methylation-Gold™ Kit (Zymo Research).

2.3  Bisulfite PCR and PCR Purification

1. Thermocycler.
2. Platinum® Taq DNA polymerase complete with 10× PCR buffer and 50 mM MgCl2 (Invitrogen).
3. dNTP mixture.
4. Primers for amplification of region of interest from bisulfite modified DNA.
5. PCR purification kit or gel extraction kit.

2.4  Ligation and Transformation of PCR Products

1. pCR®2.1-TOPO® TA Cloning vector kit (Invitrogen).
2. Luria–Bertani (LB) agar plates supplemented with 50 μg/mL carbenicillin, 80 μg/mL 5-bromo-4-chloro-3-indolyl-β-d-galactopyranoside (X-β-gal), and 500 μM isopropyl-β-d-thiogalactopyranoside (IPTG, BDH).
3. Water bath set to 42 °C.
4. Super Optimal broth with Catabolite repression (SOC) media: 2 % (w/v) Bacto-Tryptone, 0.5 % (w/v) yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, and 20 mM glucose.
5. Ice box.
6. Orbital shaker.
7. Incubator set to 37 °C.
8. Chemically competent DH5α E. coli.

2.5  Colony PCR

1. Standard PCR reagents and equipment as listed in Subheading 2.3, items 1–3.
2. M13 sequencing primers (5′-GTTTTCCCAGTCACGAC-3′ and 5′-CAGGAAACAGCTATGAC-3′).
3. 96-well PCR reaction plates.

2.6  Phosphatase and Exonuclease Treatment

1. Thermocycler.
2. Antarctic phosphatase, 5,000 U/mL complete with 10× Antarctic phosphatase buffer.
3. Exonuclease I, 20,000 U/mL.

2.7  Fluorescent Automated DNA Sequencing

1. Thermocycler.
2. BigDye® Terminator v3.1 Cycle Sequencing Kit complete with 5× reaction buffer (Applied Biosystems).
3. Ethanol (95 % and 70 % (v/v)).
4. 3 M Sodium Acetate (pH 5.2).
5. Refrigerated centrifuge with a plate spinning rotor capable of 2,235 RCF.
6. ABI3730 DNA analyzer (Applied Biosystems).

2.8  Data Interpretation

1. DNA sequence viewing software.
2. “CpGviewer” interactive bisulfite DNA sequencing analysis tool [10] or equivalent bisulfite DNA sequencing analysis software.

3  Methods

3.1  In Silico Characterization

1. Obtain the sequence of the CpG island promoter or of other regions of interest.
2. Using a sequence alignment tool, search for regions of homology in the genome (Fig. 1a).
3. Identify the sequence differences between the pseudogene and the parental gene (Fig. 1b).
4. Design bisulfite PCR primers that amplify regions that contain informative sequence differences between the parental gene and the pseudogene (see Notes 1–4 for hints on bisulfite PCR primer design).

[Fig. 1 image not reproduced. Panel (a) labels: KLLN, PTEN CGI, homologous region in PTENP1, bisulphite PCR; scale bar 500 bases. Panel (b) labels: PTENP1, bisulphite PCR; scale bars 1 kb and 100 bases; rows PTEN and PTENP1 with the sequence fragments CCTCCAGC, CCGCCGGC; CGGACGAGA CTCCAT, CGCACGGGA CTGGAT; GCC---------GCC, GCCGCCGCCGCCGCC.]

Fig. 1 Regions of homology between PTEN and the PTENP1 pseudogene and identification of discriminating sequence differences. (a) The PTEN CpG island (green bar) is a bidirectional promoter encompassing the 5′UTR of the KLLN and PTEN genes (blue bars). The fragmented black bar indicates the region that shares high sequence homology with the PTENP1 pseudogene and the degree of similarity. Thin vertical red lines indicate single nucleotide differences, whilst gaps indicate larger regions of sequence variation between the PTEN and PTENP1 genes across this region. The bracket indicates the region amplified using bisulfite PCR primers as described in Bennett et al. [11]. (b) Shown is the region of the PTENP1 processed pseudogene (light blue bar) that is amplified by the bisulfite PCR described in (a). Thin vertical red lines and gaps within the black bar indicate the locations of sequence differences (bottom) that can be used to distinguish PTEN alleles from PTENP1 alleles
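Steps 3 and 4, together with the strand choice discussed in Note 1, can be illustrated with a short Python sketch. It is not part of the published protocol: it assumes the two sequences are already aligned to equal length, contain only A, C, G, T or a gap character, and that every cytosine is unmethylated (the worst case for losing a difference). The function names are hypothetical. The example pair is a fragment shown in Fig. 1b; which string belongs to PTEN and which to PTENP1 is assumed here for illustration only.

```python
# Sketch: which parental/pseudogene differences survive bisulfite conversion?
# Assumption: every cytosine is unmethylated and is therefore read as T.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "-": "-"}


def convert_base(base):
    """Bisulfite-convert one base: an unmethylated C is read as T after PCR."""
    return "T" if base == "C" else base


def informative_differences(parental, pseudogene):
    """List differences and whether each survives conversion on each strand.

    Inputs are pre-aligned, equal-length, top-strand sequences made of
    A, C, G, T or '-' (alignment gap). Returns tuples of
    (1-based position, parental base, pseudogene base, top_ok, bottom_ok).
    """
    if len(parental) != len(pseudogene):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    pairs = zip(parental.upper(), pseudogene.upper())
    for position, (a, b) in enumerate(pairs, start=1):
        if a == b:
            continue
        # Top strand: compare the bases as they read after conversion.
        top_ok = convert_base(a) != convert_base(b)
        # Bottom strand: take the complementary bases, then convert them.
        bottom_ok = convert_base(COMPLEMENT[a]) != convert_base(COMPLEMENT[b])
        differences.append((position, a, b, top_ok, bottom_ok))
    return differences


# Fragment pair from Fig. 1b (gene assignment assumed for illustration).
for row in informative_differences("CCTCCAGC", "CCGCCGGC"):
    print(row)
```

A difference flagged as not surviving on one strand is a reason to design the bisulfite PCR against the other strand, as described in Note 1.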

3.2  Sodium Bisulfite DNA Modification

Extract genomic DNA from the cell line or tissue of interest using phenol-chloroform DNA extraction or a commercially available kit. Extracted genomic DNA is then bisulfite modified, which involves the selective chemical conversion of cytosine to uracil, whereas 5-methylcytosine remains refractory to this conversion.

1. Dilute 1 μg of genomic DNA to 20 μL in nuclease-free water.
2. Prepare the “CT conversion reagent” (provided in the EZ DNA Methylation-Gold™ kit) according to the manufacturer’s instructions and add 130 μL to the DNA.
3. Place the reaction tube in a thermocycler and incubate at 98 °C for 10 min followed by 53 °C for 18 h. This extended incubation time ensures the complete modification of DNA.
4. Recover the modified DNA using the DNA binding columns provided in the EZ DNA Methylation-Gold™ kit according to the manufacturer’s instructions. Elute the modified DNA in 50 μL nuclease-free water to obtain ~20 ng/μL bisulfite modified DNA.

3.3  Bisulfite PCR and PCR Purification

1. The optimum conditions for each PCR must be determined empirically. We commonly use the following conditions when optimizing bisulfite PCRs: 0.2 mM dNTPs, 0.4–1 μM each primer, 0.5–1 U Platinum® Taq DNA polymerase, 2 mM MgCl2, and 40–100 ng bisulfite modified DNA. The thermocycling program consists of 5 min at 95 °C, 35–40 cycles of 1 min at 94 °C, 1 min at the calculated annealing temperature of the primers, and 2 min at 74 °C, followed by 10 min at 72 °C.
2. Purify PCR amplicons using a PCR purification or gel extraction kit (see Note 5).
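The final concentrations in step 1 translate into pipetting volumes through C1 × V1 = C2 × V2. The Python sketch below is only an illustration under stated assumptions: the 25 μL reaction volume, the 10 mM dNTP stock, the 10 μM primer stocks, and the 5 U/μL polymerase stock are not specified in this chapter and should be replaced with the values on the tubes in use. The 10× buffer, the 50 mM MgCl2 stock, and the ~20 ng/μL template concentration come from Subheadings 2.3 and 3.2. The function names are hypothetical.

```python
def stock_volume(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock (uL) from C1 * V1 = C2 * V2; both concentrations share a unit."""
    return final_conc * reaction_volume_ul / stock_conc


def bisulfite_pcr_mix(reaction_volume_ul=25.0,   # assumed reaction volume
                      template_ng=40.0,          # 40-100 ng in the protocol
                      primer_final_um=0.4,       # 0.4-1 uM in the protocol
                      taq_units=1.0,             # 0.5-1 U in the protocol
                      dntp_stock_mm=10.0,        # assumed stock
                      primer_stock_um=10.0,      # assumed stock
                      taq_stock_u_per_ul=5.0,    # assumed stock
                      template_ng_per_ul=20.0):  # ~20 ng/uL after elution
    """Return a dict of component volumes (uL) for one bisulfite PCR."""
    v = reaction_volume_ul
    mix = {
        "10x PCR buffer": stock_volume(1.0, 10.0, v),
        "50 mM MgCl2": stock_volume(2.0, 50.0, v),
        "dNTPs": stock_volume(0.2, dntp_stock_mm, v),
        "forward primer": stock_volume(primer_final_um, primer_stock_um, v),
        "reverse primer": stock_volume(primer_final_um, primer_stock_um, v),
        "Taq polymerase": taq_units / taq_stock_u_per_ul,
        "bisulfite DNA": template_ng / template_ng_per_ul,
    }
    water = v - sum(mix.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume")
    mix["nuclease-free water"] = water
    return mix


for component, volume in bisulfite_pcr_mix().items():
    print(f"{component}: {volume:.2f} uL")
```

Raising `template_ng` or `primer_final_um` within the ranges given in step 1 reduces the water volume accordingly, which is a quick way to check that an optimization series still fits the chosen reaction volume.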

3.4  Ligation and Transformation of PCR Products

1. Perform ligation into the pCR®2.1-TOPO® TA cloning vector according to the manufacturer’s instructions. Ligation is performed for 30 min at room temperature.
2. Thaw DH5α E. coli on ice. Aliquot 50 μL into a fresh 1.5 mL tube and add 2 μL of ligation reaction. Gently mix with the pipette tip. Do not vortex or pipette up and down. Incubate on ice for 30 min.
3. Transform the bacteria by heat shock in a 42 °C water bath for 30 s. Incubate on ice for 2 min.
4. Add 450 μL room temperature SOC media.
5. Incubate in an orbital shaker for 90 min at 37 °C, shaking at 250 rpm.
6. Evenly spread 100 μL of each transformation mixture onto the surface of a pre-warmed LB agar plate (see Note 6).
7. Incubate at 37 °C for 20 h.

3.5  Colony PCR

1. Remove the LB agar plate from the incubator and mark the desired number of white colonies for colony PCR.
2. Place 5 μL nuclease-free water into the bottom of the desired number of wells within a 96-well PCR plate, plus one additional well for a negative control (see Note 7).
3. To inoculate the water, gently touch a white colony using a pipette tip and place it in a well within the 96-well plate. Leave the tip in the well and continue picking colonies until the desired number is reached (see Note 8).
4. Gently shake the plate to agitate the tips. This will disperse the bacterial cells into the water. To remove the tips without cross-contaminating wells, invert the plate over a waste disposal bin.
5. Prepare a PCR master mix containing 0.25 mM dNTPs, 2 mM MgCl2, 1 U Platinum® Taq DNA polymerase, and M13 sequencing primers (0.4 μM each primer, see Note 9).
6. Place 20 μL PCR master mix into each well to obtain a final volume of 25 μL. Seal the plate and place it in a thermocycler.
7. Incubate for 5 min at 95 °C followed by 30 cycles of 30 s at 95 °C, 30 s at 50 °C, 30 s at 72 °C, and a final incubation for 10 min at 72 °C (see Note 10).
8. To identify reactions ready for sequencing, load 5 μL of each reaction into a 2 % agarose gel (see Note 11).
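Planning a colony PCR plate is simple bookkeeping: one well per picked colony plus the negative control, 20 μL master mix per well, and an expected band of insert size plus ~200 bp of vector sequence (see Note 11). The Python sketch below is a convenience illustration only. The 10 % pipetting overage is an assumption that is not part of the protocol, and the function name is hypothetical.

```python
def colony_pcr_plan(n_colonies, insert_bp, overage=0.10, vector_flank_bp=200):
    """Plan one colony PCR plate.

    n_colonies      -- number of white colonies to screen
    insert_bp       -- size of the bisulfite PCR amplicon that was cloned
    overage         -- extra master mix to cover pipetting loss (assumed 10 %)
    vector_flank_bp -- vector sequence amplified by the M13 primers (~200 bp, Note 11)
    """
    wells = n_colonies + 1  # one additional well for the negative control
    if wells > 96:
        raise ValueError("more wells than a 96-well plate provides")
    return {
        "wells": wells,
        "water_per_well_ul": 5.0,
        "master_mix_total_ul": 20.0 * wells * (1.0 + overage),
        "expected_product_bp": insert_bp + vector_flank_bp,
    }


# Example: 24 colonies carrying a hypothetical 400 bp insert.
plan = colony_pcr_plan(24, 400)
for key, value in plan.items():
    print(key, value)
```

Reactions whose band on the 2 % agarose gel matches `expected_product_bp` are the ones to carry forward to Subheading 3.6.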


3.6  Phosphatase and Exonuclease Treatment

1. Add 0.5 μL (2.5 U) Antarctic phosphatase, 0.5 μL (10 U) exonuclease I, 2.5 μL 10× Antarctic phosphatase buffer, and 1 μL nuclease-free water (final volume 25 μL, see Note 12) to each reaction.
2. Incubate in a thermocycler for 30 min at 37 °C followed by 20 min at 80 °C.
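The enzyme amounts quoted in this subheading follow directly from the stock concentrations listed in Subheading 2.6. The short Python sketch below reproduces that arithmetic so it can be repeated for stocks of a different concentration. The function name is hypothetical.

```python
def units_added(volume_ul, stock_units_per_ml):
    """Enzyme units delivered by a given volume of stock (1 mL = 1000 uL)."""
    return volume_ul * stock_units_per_ml / 1000.0


# Stock concentrations from Subheading 2.6.
phosphatase_units = units_added(0.5, 5000)    # Antarctic phosphatase, 5,000 U/mL
exonuclease_units = units_added(0.5, 20000)   # exonuclease I, 20,000 U/mL

# Buffer strength once 2.5 uL of 10x buffer sits in the stated 25 uL final volume.
buffer_fold = 10.0 * 2.5 / 25.0

print(phosphatase_units, exonuclease_units, buffer_fold)
```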

3.7  Fluorescent Automated DNA Sequencing

1. Remove 4 μL of each reaction and place it in a fresh PCR plate.
2. Add 0.5 μL BigDye® Terminator v3.1 Cycle Sequencing reagent, 2 μL 5× BigDye sequencing reaction buffer, 1 μL of 3.2 μM primer (see Note 13), and 2.5 μL nuclease-free water (final volume 10 μL). Place the reaction in a thermocycler.
3. Incubate for 25 cycles of 20 s at 94 °C, 20 s at 50 °C, and 4 min at 60 °C. Store reactions at 4 °C and protect from light.
4. Add 25 μL 95 % (v/v) ethanol (chilled on ice) and 1 μL 3 M Sodium Acetate (pH 5.2) to each well.
5. Seal the plate and place it in a 4 °C centrifuge.
6. Centrifuge at 2,235 RCF for 20 min.
7. Remove ethanol and add 50 μL 70 % (v/v) ethanol. Repeat steps 5 and 6.
8. Remove the 70 % (v/v) ethanol and air-dry for 10 min.
9. Sequence using an ABI3730 DNA analyzer.

3.8  Data Interpretation

1. Using DNA sequence viewing software, separate pseudogene and parental gene alleles based on sequence differences identified in step 3 in Subheading 3.1.
2. Determine the methylation status of each individual CpG dinucleotide using the “CpGviewer” software [10] (Fig. 2). To use this software, the genomic sequences of the regions analyzed (both the pseudogene and the parental gene) are required in plain text (.txt format), as well as the electropherogram for each allele (.ab1 format).

[Fig. 2 image not reproduced. Panel (a): PTEN-derived allele, PTENP1-derived allele. Panel (b): PTEN-derived allele (unmethylated), PTENP1-derived allele (methylated).]

Fig. 2 Allelic bisulfite sequencing shows DNA methylation is specifically associated with PTENP1 and not the PTEN CpG island. (a) Allelic bisulfite sequencing data showing the methylation status of PTEN and PTENP1-derived alleles in the hematological cancer cell line Raji (taken from Hesson et al. [7]). Each line represents a single allele. Circles indicate the positions of CpG dinucleotides; black circles indicate methylated CpG dinucleotides, white circles indicate unmethylated CpG dinucleotides; yellow diamonds indicate the positions of nucleotide variations within PTENP1 alleles used to discriminate between PTEN and PTENP1 alleles; black diamonds indicate the positions of additional CpG dinucleotides specific to PTENP1 alleles that were also methylated. (b) Representative electropherograms showing an unmethylated PTEN allele and a methylated PTENP1 allele. Indicated by the black arrow is the position of a nucleotide variation used to discriminate the PTEN and PTENP1 alleles
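The logic of steps 1 and 2 in this subheading can be sketched in Python. This is a simplified stand-in and not how CpGviewer works internally: it assumes each clone read has already been base-called, trimmed, and aligned to the top-strand genomic references with no insertions or deletions, and it ignores sequencing errors. All function names and the toy sequences are hypothetical.

```python
def reads_as(reference_base, read_base):
    """True if the read base is compatible with the genomic base after
    bisulfite conversion (a genomic C reads as C if methylated, else T)."""
    return read_base == reference_base or (reference_base == "C" and read_base == "T")


def discriminating_positions(parental, pseudogene):
    """0-based positions where the references differ, excluding C/T
    mismatches, which conversion of an unmethylated C would erase."""
    return [i for i, (a, b) in enumerate(zip(parental, pseudogene))
            if a != b and {a, b} != {"C", "T"}]


def classify_read(read, parental, pseudogene):
    """Assign a clone read to 'parental', 'pseudogene', or 'ambiguous'."""
    positions = discriminating_positions(parental, pseudogene)
    parental_hits = sum(reads_as(parental[i], read[i]) for i in positions)
    pseudogene_hits = sum(reads_as(pseudogene[i], read[i]) for i in positions)
    if parental_hits > pseudogene_hits:
        return "parental"
    if pseudogene_hits > parental_hits:
        return "pseudogene"
    return "ambiguous"


def call_cpg_methylation(read, reference):
    """Map the 1-based position of each CpG cytosine to True (methylated,
    read as C), False (unmethylated, read as T), or None (uncallable)."""
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":
                calls[i + 1] = True
            elif read[i] == "T":
                calls[i + 1] = False
            else:
                calls[i + 1] = None
    return calls


# Toy references differing at one A/G position (hypothetical sequences).
parental = "ACGTACGATCCA"
pseudogene = "ACGTACGGTCCA"
for read in ("ATGTATGATTTA", "ACGTACGGTTTA"):
    origin = classify_read(read, parental, pseudogene)
    reference = parental if origin == "parental" else pseudogene
    print(origin, call_cpg_methylation(read, reference))
```

Reads classified as ambiguous carry no usable discriminating position and should be excluded rather than assigned to either gene.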


4  Notes

1. Choice of the DNA strand. Following bisulfite modification, the two previously complementary DNA strands become noncomplementary single-stranded DNAs that can be amplified separately using strand-specific PCR primers. Once the region of interest has been identified, it is crucial to choose the most appropriate DNA strand, so that the sequence differences between the parental gene and the pseudogene remain informative even after the treatment with sodium bisulfite. For example, C/T mismatches between the parental gene and the pseudogene (with the C unmethylated) are lost following bisulfite conversion, but the corresponding G/A mismatches present on the other strand do persist.
2. Basic principles of bisulfite PCR primer design. It is essential that bisulfite PCR primers are specific for bisulfite modified DNA. To achieve this goal, we design them so that they include thymines (originally cytosines) at critical positions, such as a thymine (originally cytosine) at the most 3′ base or a short stretch of thymines (originally cytosines) in the central part of the primer. Bisulfite modification reduces the complexity of the DNA sequence, making it more difficult to design primers with a low rate of off-target binding. It is therefore important to incorporate a mix of the three remaining bases A, T, and G, whenever possible. In this respect, increasing primer length (between 25 and 40 nt) increases primer specificity and also compensates for reduced primer annealing temperature due to the loss of cytosine bases. For allelic bisulfite sequencing, amplification of modified DNA with no bias towards original methylation status is also crucial. This is achieved by avoiding CpG dinucleotides within primer binding sites.
3. Amplicon size and nested PCRs. Generally bisulfite PCR works well with small amplicon sizes (up to ~500 bp). Nested PCRs can improve the amplification of particularly large amplicons or regions for which it is difficult to obtain specific amplicons. Nested bisulfite PCRs are split into two reactions, the first of which involves a limited number of cycles (~20). A small amount of this reaction is then transferred to a second PCR reaction, which includes nested primers.
4. When analyzing the methylation status of any gene with a pseudogene that shares high sequence homology, we advise against the use of other techniques such as bisulfite pyrosequencing, methylation-specific PCR (MSP) or combined bisulfite restriction analysis (COBRA). These techniques may not be informative for the proportion of the amplicon that is pseudogene-derived and/or may not allow for discrimination of whether methylation originates from the pseudogene or the parental gene.
5. Ligation efficiency is improved by PCR purification. This can be done using PCR column purification or gel extraction. Gel extraction is desirable if the reaction contains significant primer dimer or nonspecific PCR products.
6. If low numbers of transformants are expected, then the entire transformation mixture can be plated following collection of cells by centrifugation and resuspension in 100 μL SOC media.
7. Each white colony should contain a single allele of the region of interest. The number of colonies to be screened depends largely on the application. Sequencing of a greater number of alleles will give a more accurate representation of the methylation status of a region across a population of cells, as well as of the proportion of methylated pseudogene and parental gene-derived sequences.
8. When picking the colonies, avoid scraping the entire colony as this will overload the PCR reaction with too much DNA. If a plasmid miniprep containing a cloned allele is required, the remainder of the colony can be used to inoculate LB broth containing 50 μg/mL carbenicillin.
9. The use of M13 primers standardizes colony PCR conditions and also ensures that different primers can be used for colony PCR and sequencing reactions, which reduces background in sequencing reactions.
10. The initial incubation for 5 min at 95 °C is essential to activate Platinum® Taq DNA polymerase and also releases plasmid DNA from bacteria.
11. Colony PCR product size is defined by the size of the insert ligated into the pCR®2.1-TOPO® TA cloning vector plus ~200 bp of flanking vector sequence.
12. Prior to sequencing, unincorporated dNTPs and primers must be removed from the PCR reaction. This can be done enzymatically or through purification columns. We recommend enzymatic treatment using Antarctic phosphatase and exonuclease I, which dephosphorylate dNTPs and remove single-stranded DNA primers, respectively. These enzymes are then heat inactivated, thereby preventing interference with subsequent sequencing. The use of these enzymes allows more convenient high-throughput sequencing of individual alleles in a 96-well format.
13. Add only one primer from the set used to obtain the original PCR amplicon.
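The primer design rules in Note 2 lend themselves to a quick automated check. The Python sketch below is a minimal illustration, not a primer design tool: it handles only a forward primer against the bisulfite-converted top strand, takes the unconverted genomic binding site as input, and does not estimate annealing temperature or off-target binding. The function name and the example site are hypothetical.

```python
def check_forward_bisulfite_primer(genomic_site):
    """Check a candidate forward-primer binding site against Note 2.

    genomic_site is the unconverted top-strand sequence (5'->3') that the
    primer should cover. The primer itself carries T at every original C.
    """
    site = genomic_site.upper()
    cpg_count = site.count("CG")
    return {
        # Primer sequence matching the fully converted top strand.
        "primer": site.replace("C", "T"),
        # Note 2 suggests a length between 25 and 40 nt.
        "length_ok": 25 <= len(site) <= 40,
        # CpGs in the binding site would bias amplification by methylation status.
        "cpg_free": cpg_count == 0,
        # Non-CpG cytosines give the primer specificity for converted DNA.
        "non_cpg_cytosines": site.count("C") - cpg_count,
        # A converted cytosine at the most 3' base is one suggested design.
        "three_prime_converted_c": site.endswith("C"),
    }


# Hypothetical 28 nt binding site used only to exercise the checks.
report = check_forward_bisulfite_primer("AGGTCAGATTCTGAGGACATTGGAAGTC")
for key, value in report.items():
    print(key, value)
```

A reverse primer would need the same checks applied to the reverse complement of the converted strand, which this sketch leaves out.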


References

1. Zheng D, Frankish A, Baertsch R et al (2007) Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution. Genome Res 17:839–851
2. Karro JE, Yan Y, Zheng D et al (2007) Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res 35:D55–D60
3. Kalyana-Sundaram S, Kumar-Sinha C, Shankar S et al (2012) Expressed pseudogenes in the transcriptional landscape of human cancers. Cell 149:1622–1634
4. Whang YE, Wu X, Sawyers CL (1998) Identification of a pseudogene that can masquerade as a mutant allele of the PTEN/MMAC1 tumor suppressor gene. J Natl Cancer Inst 90:859–861
5. Zysman MA, Chapman WB, Bapat B (2002) Considerations when analyzing the methylation status of PTEN tumor suppressor gene. Am J Pathol 160:795–800
6. Cortese R, Krispin M, Weiss G et al (2008) DNA methylation profiling of pseudogene-parental gene pairs and two gene families. Genomics 91:492–502
7. Hesson LB, Packham D, Pontzer E et al (2012) A reinvestigation of somatic hypermethylation at the PTEN CpG island in cancer cell lines. Biol Proced Online 14:5
8. Meyer LR, Zweig AS, Hinrichs AS et al (2012) The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res 41(Database issue):D64–D69
9. Altschul SF, Gish W, Miller W et al (1990) Basic local alignment search tool. J Mol Biol 215:403–410
10. Carr IM, Valleley EM, Cordery SF et al (2007) Sequence analysis and editing for bisulphite genomic sequencing projects. Nucleic Acids Res 35:e79
11. Bennett KL, Mester J, Eng C (2010) Germline epigenetic regulation of KILLIN in Cowden and Cowden-like syndrome. JAMA 304:2724–2731