
Site-specific integration and tailoring of cassette design for sustainable gene transfer

© 2011 Nature America, Inc. All rights reserved.

Angelo Lombardo1,2, Daniela Cesana1,2,7, Pietro Genovese1,2,7, Bruno Di Stefano3,6,7, Elena Provasi2,4,7, Daniele F Colombo1,2,7, Margherita Neri1,2, Zulma Magnani4, Alessio Cantore1,2, Pietro Lo Riso1,2, Martina Damo1,2, Oscar M Pello1, Michael C Holmes5, Philip D Gregory5, Angela Gritti1, Vania Broccoli3, Chiara Bonini4 & Luigi Naldini1,2

Integrative gene transfer methods are limited by variable transgene expression and by the consequences of random insertional mutagenesis that confound interpretation in gene-function studies and may cause adverse events in gene therapy. Site-specific integration may overcome these hurdles. Toward this goal, we studied the transcriptional and epigenetic impact of different transgene expression cassettes, targeted by engineered zinc-finger nucleases to the CCR5 and AAVS1 genomic loci of human cells. Analyses performed before and after integration defined features of the locus and cassette design that together allow robust transgene expression without detectable transcriptional perturbation of the targeted locus and its flanking genes in many cell types, including primary human lymphocytes. We thus provide a framework for sustainable gene transfer in AAVS1 that can be used for dependable genetic manipulation, neutral marking of the cell and improved safety of therapeutic applications, and demonstrate its feasibility by rapidly generating human lymphocytes and stem cells carrying targeted and benign transgene insertions.

Integrative gene transfer is widely used to study gene function, confer new properties on cells or organisms and, in the clinical setting, to correct disease. However, the use of conventional vectors that integrate semirandomly throughout the genome has several limitations. Because transgene expression is influenced by the integration site, expression varies and silencing occurs among transduced cells1. Some insertions may disrupt genes or perturb their transcription, altering the biological properties of the transduced cell. This may occur when insertion activates a proto-oncogene and triggers cell transformation, as has been reported in some gene-therapy applications2. Site-specific integration may overcome these hurdles. We and others have previously described the use of engineered zinc-finger nucleases (ZFNs) to target gene transfer3–6. A ZFN-induced DNA double-strand break at a predetermined site of the genome can trigger homology-directed repair, a pathway exploited to insert a transgene into the ZFN target site from an exogenous template. With this approach, exogenous DNA sequences can be delivered into specific loci for comparative or subtractive gene function studies without the confounding influences of insertion sites and to replace malfunctioning genes7,8. However, the choice of a suitable genomic acceptor site for wide application in gene transfer and the optimal design of the transgene cassette to ensure robust expression without perturbing nearby endogenous transcription remain to be investigated.

An ideal site for transgene insertion should allow (i) robust and stable transgene expression across different cell types; (ii) no transcriptional perturbation owing to the transgene cassette and (iii) no disruption of essential regulatory or coding sequences due to the transgene cassette. Such sites may be identified in model organisms such as the mouse from data on established transgenic lines9, although these data might not apply to the human system. In humans, such sites may be sought by combining emerging knowledge on sequence variation among genomes10 and on clinically silent homozygous gene deficiencies with gene expression atlases11, and possibly from the few identified functional vector insertions associated with a benign outcome in gene therapy clinical trials12,13. We and others have previously investigated the potential suitability for transgene insertion at two human genomic locations: (i) a common integration site of the human non-pathogenic adeno-associated virus (AAV), found between exon 1 and intron 1 of protein phosphatase 1 regulatory subunit 12C (PPP1R12C) gene, known as AAV site 1 (AAVS1)14 and (ii) the HIV co-receptor chemokine (C-C motif) receptor 5 (CCR5) gene, for which a homozygous deletion is found in apparently healthy individuals15.
We previously described efficient integration into CCR5 mediated by ZFNs targeting exon 3 of the gene, resulting in stable transgene expression in human cell lines and embryonic stem cells (ESCs)3. Transgene insertion at AAVS1 has been achieved in mouse and human ESCs or induced pluripotent stem cells (iPSCs) either by the AAV replication protein16,17 or by ZFNs targeting intron 1 of PPP1R12C18,19. These studies had shown transgene expression and maintenance of pluripotency upon AAVS1 targeting. However, the impact of the insertion on the target and flanking genes had not been investigated. Here we address this issue by targeting different expression cassettes into CCR5 or AAVS1 of different human cell types with ZFN technology. We achieved efficient site-specific integration and investigated the transcriptional output and epigenetic response of the cassette and the locus. We show that the choice of the locus and the design of the cassette can be optimized to achieve robust transgene expression without altering endogenous transcription at and around the insertion site. This outcome was reproducible at AAVS1 across multiple transgenes and cell types, thus validating this site for effective and safe transgene insertion in human cells. Our analysis establishes AAVS1 as a site for sustainable gene transfer that prevents confounding and detrimental effects of integration, and as a site that can be used in both experimental and therapeutic settings.

1San Raffaele Telethon Institute for Gene Therapy, Division of Regenerative Medicine, Gene Therapy and Stem Cells, San Raffaele Institute, Milan, Italy. 2Vita-Salute San Raffaele University, Milan, Italy. 3Stem Cell and Neurogenesis Unit, Division of Neurosciences, San Raffaele Institute, Milan, Italy. 4Experimental Hematology Unit, Division of Regenerative Medicine, Gene Therapy and Stem Cells, Program in Immunology and Bio-immunotherapy of Cancer, San Raffaele Scientific Institute, Milan, Italy. 5Sangamo BioSciences Inc., Richmond, California, USA. 6Present address: Hematopoietic Stem Cell Biology and Differentiation Group, Department of Differentiation and Cancer Centre for Genomic Regulation, Barcelona, Spain. 7These authors contributed equally to this work. Correspondence should be addressed to L.N. ([email protected]). Received 8 February; accepted 29 July; published online 21 August 2011; doi:10.1038/nmeth.1674

nature methods | ADVANCE ONLINE PUBLICATION

RESULTS

Integration and transgene expression at CCR5 and AAVS1

We constructed enhanced GFP (EGFP) expression cassettes driven by different promoters of viral origin (spleen focus forming virus (SFFV) promoter) and cellular origin (phosphoglycerate kinase (PGK) promoter; and eukaryotic elongation factor 1 alpha (EEF1A1, here referred to by the synonym EF1A) promoter with or without an intron), all terminating at the same poly(A) site. We flanked each cassette by sequences homologous to CCR5 or PPP1R12C to promote targeted integration at either locus (Fig. 1a). We then delivered the appropriate combination of ZFNs and EGFP expression cassette to human B-lymphoblastoid cells by integrase-defective lentiviral vectors (IDLVs) resulting in 7–12% green fluorescent (EGFP+) cells for each construct at both targeted loci (Fig. 1b). Molecular analyses performed on the EGFP+ sorted cell pools and single cell–derived clones (32 clones for CCR5 and 77 clones for AAVS1) confirmed transgene integration at the target site by homology-directed repair, either as a single cassette or concatemer, in >90% of the cells (Fig. 1c,d and Supplementary Fig. 1). Although integration efficiency (percentage of EGFP+ cells) was unaffected by the target site, the mean fluorescence intensity (MFI, a measure of the average expression per cell) of EGFP depended on both the promoter and the target locus, with AAVS1 outperforming CCR5 (Fig. 1e). We reproduced these findings using another transgene (Supplementary Fig. 2). The lower transgene expression we observed at CCR5 was mediated in part by cis-acting DNA sequences because insertion into AAVS1 of a transgene expression cassette together with as little as a 0.9-kilobase (kb) DNA fragment cloned from the CCR5 locus resulted in twofold lower EGFP MFI compared to transgene expression from the unmodified cassette targeted into the same locus (Fig. 1f).

Figure 1 | Targeted integration and transgene expression in CCR5 and AAVS1 of human lymphoblastoid cells. (a) Schematics of ZFN-induced targeted integration into the indicated loci. The targeting constructs (donor IDLV) are depicted as reverse-transcribed IDLVs containing the expression cassette flanked by homology sequences (gray lines and boxes) to CCR5 or PPP1R12C. LTR: self-inactivating long terminal repeat. ψ, packaging signal. ZFN cleavage sites are indicated. BGHpA: poly(A) site from the bovine growth hormone gene. (b) EGFP+ cells (mean ± s.e.m., n = 4–7 for each cassette per locus) one month after transduction with the indicated donor IDLVs, with or without IDLVs expressing the indicated ZFNs. (c,d) Histograms (top) show the EGFP MFI (in arbitrary units; a.u.) of the indicated samples. Southern blots (middle) show targeted integration of the indicated cassettes into CCR5 (c; blot probed for EGFP) or AAVS1 (d; blots probed for AAVS1) in sorted EGFP+ cells and in single cell–derived clones from the donor and ZFNs conditions as in b. In c, vertical dashed line separates samples treated with the indicated cassettes. Schematics (right) show the expected targeted integration (TI) of a single cassette or a donor head-to-tail vector concatemer into the target site (HT-TI), with the restriction sites and probes (colored bars) used for analysis. Most sorted cells and all clones contained mono- or biallelic (*) integration at the target site. PCR analyses (bottom) for the 5′ and 3′ junctions generated by targeted integration of the cassette into the locus (TI) or for the presence of head-to-tail vector concatemers (HT) that can be either targeted or not into the site. NA, not assessed. (e) Representative of three flow cytometry analyses of mock-treated (untransduced cells), or CCR5 or AAVS1 gene-targeted lymphoblastoid cells from b showing the green fluorescence (in arbitrary units; a.u.) driven by the indicated promoters. The EGFP MFI of the indicated cassettes upon targeted integration into CCR5 or AAVS1 is shown in red or blue, respectively. (f) Representative green fluorescence and MFI from PGK-EGFP-pA cassettes targeted into AAVS1 of lymphoblastoid cells together with the indicated DNA fragments from the CCR5 gene. Panel f legend, MFI (a.u.): PGK-EGFP-pA = 42,000; PGK-EGFP-pA + 0.6 kb = 43,000; PGK-EGFP-pA + 0.9 kb = 29,000; PGK-EGFP-pA + 1.5 kb = 28,000.
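The two flow cytometry readouts used above, integration efficiency (percentage of EGFP+ cells) and MFI (average expression per positive cell), measure different things, which a short sketch can make concrete. The function name `summarize_flow`, the gate value and the per-cell intensities below are all hypothetical and are not taken from the paper; real analyses would gate on mock-treated controls.

```python
from statistics import mean

def summarize_flow(intensities, gate):
    """Return (percent_positive, mfi) for a list of per-cell fluorescence values.

    `gate` is an assumed threshold separating EGFP+ from EGFP- cells;
    MFI is computed as the mean intensity of the gated (positive) cells.
    """
    positive = [x for x in intensities if x > gate]
    percent_positive = 100.0 * len(positive) / len(intensities)
    mfi = mean(positive) if positive else float("nan")
    return percent_positive, mfi

# Made-up intensities in arbitrary units (a.u.), for illustration only.
cells = [5, 8, 12, 900, 1100, 7, 9, 6, 10, 1000]
pct, mfi = summarize_flow(cells, gate=100)
print(f"EGFP+ = {pct:.0f}%, MFI = {mfi:.0f} a.u.")
```

Two samples can share the same percentage of EGFP+ cells and still differ in MFI, which is the pattern reported here for CCR5 versus AAVS1.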

Figure 2 | Upregulation of endogenous genes at the integration site depends on the exogenous promoter and target locus. (a) To-scale representation (top) of the 400 kb genomic region centered on CCR5 on human chromosome 3p21. Genes (boxes indicate exons) and their transcripts (arrows) are shown. The cassette integration site in exon 3 of CCR5 and its transcriptional orientation are indicated by the red arrow. Fold changes in expression (bottom) measured by qRT-PCR of the indicated genes in EGFP+ sorted B-lymphoblastoid cells relative to EGFP− sorted and mock-treated cells, upon targeting the indicated cassettes into CCR5. For RTP3, LUZZP, LRRC2 and CCR1 n = 2 for the EF1A and EF1A_intron cassettes; thus, mean ± range is indicated. For all other cassettes and all genes or conditions n = 6–12; thus, mean ± s.e.m. is indicated. *P < 0.05; **P < 0.01; ***P < 0.001 (one-way ANOVA with Bonferroni's multiple comparison post-test). ND, not detectable. Expression of each gene in mock-treated cells relative to B2M is indicated by change in cycle threshold (Ct); lower ∆Ct correlates with higher expression. NA, not applicable (Ct ≥ 37). Dashed line indicates the reference value in mock-treated cells. (b) Fold change in CCR5, CCR3 and B2M (latter for normalization) expression in the indicated samples relative to mock-treated cells. Horizontal bars indicate mean fold change; statistical analysis as in a. Based on the variance in B2M and HPRT expression among all 200 samples analyzed, fold changes of 0.6–1.4 (dashed lines in the bottom plot) were not considered relevant. (c) Comparison of EGFP and CCR5 expression, expressed as ∆Ct over B2M value, for the indicated cassettes in the EGFP+ B-lymphoblastoid cells from samples in a and b (mean ± s.e.m., n = 8–12 samples). (d) Similar analysis as in a performed for the 400-kb genomic region surrounding AAVS1 on chromosome 19q13 upon targeting of indicated cassettes. For all genes tested n = 2 for the SFFV, EF1A and EF1A_intron cassettes; thus mean ± range is indicated by error bars. For the PGK cassette n = 4; thus mean ± s.e.m. is indicated by error bars. Dashed line, reference value for mock-treated cells. (e) EGFP MFI (arbitrary units; a.u.) of the indicated expression cassettes upon targeted integration into CCR5 or AAVS1 of HepG2 cells (mean ± s.e.m., n = 3). ET, enhanced transthyretin promoter (synthetic hepatocyte-specific promoter); hAAT, proximal promoter of the human α1-antitrypsin gene; ApoE, three copies of the enhancer of the apolipoprotein E gene cloned upstream of the hAAT promoter; and HCR, hepatic control region of the apolipoprotein genes cluster cloned upstream of the hAAT promoter. (f) EGFP+ sorted HepG2 cells from e were analyzed as in a and d for expression of the indicated genes (mean ± s.e.m., n = 3 sorted cell pools per promoter per locus). Dashed line indicates the reference value in mock-treated cells. (g) CCR5 expression in EGFP+ lymphoblastoid cells upon targeting the SFFV-EGFP-pA cassette into CCR5 with or without the AAVS1 insulator (Ins) cloned in forward (InsF) or reverse (InsR) orientation (mean ± s.e.m.; n = 3 sorted cell pools per cassette). Prom., promoter.
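The Figure 2 legend reports expression as ∆Ct over B2M and as fold change relative to mock-treated cells, with Ct ≥ 37 treated as not applicable and fold changes of 0.6–1.4 not considered relevant. The sketch below shows one common way such numbers are derived, the 2^(−∆∆Ct) calculation. That formula, and the assumption of 100% amplification efficiency, are not stated in this section, and the function names and Ct values are hypothetical.

```python
def delta_ct(ct_gene, ct_reference):
    """Delta-Ct of a gene over the reference gene (B2M in the legend)."""
    return ct_gene - ct_reference

def is_detected(ct, cutoff=37.0):
    """The legend marks Ct >= 37 as not applicable (NA)."""
    return ct < cutoff

def fold_change(dct_sample, dct_mock):
    """Assumed 2^(-ddCt) fold change of a sample relative to mock-treated cells."""
    return 2.0 ** -(dct_sample - dct_mock)

def classify(fc, low=0.6, high=1.4):
    """Apply the 0.6-1.4 band that the legend treats as not relevant."""
    if fc < low:
        return "down"
    if fc > high:
        return "up"
    return "not relevant"

# Made-up Ct values, for illustration only.
dct_mock = delta_ct(ct_gene=30.0, ct_reference=18.0)    # 12.0
dct_sample = delta_ct(ct_gene=27.0, ct_reference=18.0)  # 9.0
fc = fold_change(dct_sample, dct_mock)
print(fc, classify(fc))
```

A lower ∆Ct in the sample than in mock gives a fold change above 1, consistent with the legend's note that lower ∆Ct correlates with higher expression.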

Figure 3 | Cassette design for unperturbed target gene expression. (a) Fold change in PPP1R12C and B2M expression in EGFP+ cells from sorted pools shown in Figure 2d and single cell–derived clones characterized for mono- or biallelic targeted integration of the indicated EGFP expression cassettes relative to mock-treated and EGFP− sorted cells (analysis as in Fig. 2d). Based on the variance in B2M and HPRT expression among all 200 samples analyzed, fold changes of 0.6–1.4 (dashed lines in the bottom plot) were not considered relevant. (b,c) RT-PCR analysis (b) of the indicated EGFP+ cells from a shows aberrant PPP1R12C transcripts that contain exon 1 of the gene (c; orange line) spliced to EGFP (c; green line) in cells with targeted integration of the PGK and EF1A_intron cassettes. (d) PPP1R12C expression in EGFP+ sorted B lymphoblastoid cells upon targeting the modified PGK*-EGFP-pA cassette into AAVS1 (n = 3 sorted cell pools) relative to mock-treated cells. The two changes introduced in the PGK promoter sequence to abolish a splice acceptor (SA) site are indicated (PGK: CTCTCCCCAGGGG; PGK*: CTCTCCCCACCGG). (e) Representative flow cytometry analysis of a cell clone with monoallelic integration of the EF1A_intron-EGFP-pA cassette into AAVS1 before (left; gate region indicates the single-positive cells, EGFP+ CFP−; percentage indicated) and after (right; gate region indicates the double-positive cells, EGFP+ CFP+) targeting of an EF1A_intron-CFP-pA cassette in reverse orientation relative to the transcription of the endogenous gene in the residual wild-type AAVS1. Schematics on top depict the AAVS1 genotype. (f) Fold changes in PPP1R12C expression in the single (EGFP+ CFP−) and double-positive (EGFP+ CFP+) sorted B lymphoblastoid cells from the right plot in e (n = 3–6 sorted cell pools per condition) relative to mock-treated cells. (g) Flow cytometry data showing comparable EGFP expression upon targeted integration of an EF1A_intron cassette into AAVS1 in the same (MFI = 2,100 a.u.) or reverse (r-EF1A_intron; MFI = 2,300 a.u.) orientation relative to PPP1R12C transcription.
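The two sequences shown for panel d of Figure 3 (PGK: CTCTCCCCAGGGG; PGK*: CTCTCCCCACCGG) can be compared position by position to locate the two changes that the legend describes. The helper `mismatches` is a hypothetical name introduced here, and the check for the AG dinucleotide rests on the general rule that splice acceptor sites end in AG, which this section does not state.

```python
PGK = "CTCTCCCCAGGGG"       # sequence shown for the original PGK promoter
PGK_STAR = "CTCTCCCCACCGG"  # sequence shown for the modified PGK* promoter

def mismatches(a, b):
    """Return (0-based position, base in a, base in b) for every differing position."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

diffs = mismatches(PGK, PGK_STAR)
print(diffs)

# Assumption: a canonical splice acceptor ends in the dinucleotide AG.
print("AG in PGK:", "AG" in PGK)
print("AG in PGK*:", "AG" in PGK_STAR)
```

The comparison finds two G-to-C substitutions, and the AG dinucleotide present in the PGK fragment is absent from PGK*, in line with the stated aim of abolishing the splice acceptor site.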

Integration into AAVS1 did not upregulate nearby genes

We next determined the impact of the transgene cassette on the expression of the target and flanking genes. We determined expression of 26 genes found in the 400-kb windows centered on either the CCR5 or AAVS1 target site by quantitative reverse transcription PCR (qRT-PCR) analysis of EGFP+ and EGFP− sorted cell pools. For the CCR5 locus, although the EF1A promoter with or without the intron had no effect on the expression of any gene studied, the PGK promoter upregulated both CCR5 (ninefold) and CCR3 (twofold), whereas the SFFV promoter upregulated three genes up to 180-fold (CCR5) and at distances of up to 200 kb (CCR1) from the insertion site (Fig. 2a). Analysis of 20 single cell–derived clones with molecularly confirmed targeted integration in CCR5 demonstrated the reproducibility of these findings (Fig. 2b). Transcriptional upregulation of the locus was independent of the extent of EGFP expression, as the EF1A_intron promoter drove higher EGFP expression than the SFFV promoter without impacting endogenous gene expression (Fig. 2c). Notably, the same type of analysis performed at the AAVS1 locus revealed no significant (P > 0.05) upregulation in gene expression of flanking genes by any of the cassettes (including the cassette with the SFFV promoter) in this gene-dense region (Fig. 2d).

We then extended the analysis to other cell types and compared ubiquitous and tissue-specific promoters. We targeted EGFP expression cassettes driven either by the SFFV or by hepatocyte-specific promoters20–23 into AAVS1 or CCR5 of the HepG2 hepatocyte cell line. The EGFP+ sorted cells showed targeted insertion in at least one of the AAVS1 or CCR5 alleles (Supplementary Fig. 3). As in lymphoblastoid cells, AAVS1 always supported higher transgene expression than CCR5 (Fig. 2e). In CCR5-targeted cells, we measured significant (P < 0.05) upregulation of CCR5 expression by all promoters and of CCR1 by two of the hepatocyte-specific promoters (n = 3 sorted cell pools per cassette per locus; Fig. 2f). In contrast, we observed no significant upregulation at the targeted and flanking genes for any of the cassettes inserted into AAVS1 (Fig. 2f). This was also true upon targeting the PGK- and SFFV-driven EGFP cassettes into AAVS1 of the hematopoietic progenitor cell line K-562 (n = 3 sorted cell pools per cassette; Supplementary Fig. 4). Thus, whereas CCR5 and some of its flanking genes could be upregulated by ubiquitously expressed and tissue-specific cassettes in different cell types, AAVS1 and its flanking genes appear to be more generally resistant to such effects. The AAVS1 resistance to transcriptional upregulation may be due to a chromatin insulator in the promoter of PPP1R12C24,25. Indeed, insertion of a 350-bp region encompassing the putative PPP1R12C insulator upstream of the SFFV-driven cassette into CCR5 resulted in significantly reduced CCR5 upregulation (P < 0.001 for the InsR-SFFV-EGFP-pA cassette, n = 3 sorted cell pools per cassette; Fig. 2g and Supplementary Fig. 5).

Cassette design for unperturbed target gene expression

Because the AAVS1 target site maps in intron 1 of PPP1R12C, we investigated whether insertion of exogenous splice and poly(A) sites would perturb transcription of the endogenous gene in lymphoblastoid cell clones validated for mono- or biallelic integration (n = 74 cell clones; Fig. 3a). PPP1R12C expression was significantly reduced by ~50% (P < 0.01) and nearly 100% (P < 0.001) in cells containing the EF1A_intron cassette integrated into one or both alleles, respectively. We observed a similar pattern for the PGK-driven cassette, albeit with a lower but significant (P < 0.01) reduction that was enhanced (P < 0.001) in biallelically modified clones.

Figure 4 | Epigenetic analysis of CCR5 and AAVS1 before and after transgene insertion. (a) Schematic of the CCR5 locus in wild-type cells showing the transcription start site (TSS) and poly(A) site, the ZFN target site in exon 3 and the 9 regions investigated by qPCR (bars). ChIP analysis of wild-type B lymphoblastoid cells (bottom) showing the percentage of enrichment in RNA PolII (shaded area), histone H3 and the indicated histone H3 modifications (colored bars) versus the input, at the indicated distance from the TSS in base pairs (bp). (b) Similar analysis as in a performed on a cell clone containing biallelic integration of SFFV-EGFP-pA cassette (schematic and inset) in CCR5. (c) Schematic of PPP1R12C and magnification of the genomic region chosen for the analysis. ChIP analysis of wild-type B lymphoblastoid cells as in a. (d–f) ChIP analysis of cell clones containing biallelic integration of the indicated EGFP cassette (schematics and insets) into AAVS1 (bottom histograms). Shown are means or individual measurements (dots) from two independent ChIP experiments each analyzed by replicate qPCRs.
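The ChIP data in Figure 4 are expressed as percentage of input. A minimal sketch of the widely used percent-of-input calculation from qPCR Ct values follows. This section does not state the exact calculation that was used, so the formula, the assumption of 100% amplification efficiency, the function name and the Ct values are all illustrative.

```python
def percent_of_input(ct_ip, ct_input, input_fraction):
    """Percent of input recovered in the immunoprecipitated (IP) sample.

    `input_fraction` is the fraction of chromatin saved as input (e.g. 0.01
    for a 1% input). Assumes each PCR cycle doubles the template.
    """
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)

# Made-up Ct values, for illustration only: 1% input at Ct 25, IP at Ct 24.
value = percent_of_input(ct_ip=24.0, ct_input=25.0, input_fraction=0.01)
print(f"{value:.1f}% of input")
```

Reading one cycle earlier than a 1% input corresponds to 2% of input under these assumptions; the same function applies to each antibody and each amplicon position along the locus.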

Figure 5 | AAVS1-targeted integration and transgene expression without perturbing endogenous gene expression in primary human T lymphocytes. (a) Frequency (mean ± s.e.m., n = 3 blood donors per cassette) of EGFP+ T cells 3 weeks after targeting the indicated cassettes into CCR5 or AAVS1. (b) Growth curves of T cells treated as indicated (mean ± s.e.m., n = 3 samples per condition). Ad5/35, adenoviral vector expressing the ZFNs. (c) Southern blot showing targeted integration of the indicated EGFP cassettes into AAVS1 (locus probe) of sorted EGFP+ lymphocytes from a. Schematics as in Figure 1d. NA, not applicable. TI, percentage of AAVS1 alleles with targeted integration. (d) Histogram showing the relative EGFP MFI of T lymphocytes with targeted integration of the indicated cassettes into CCR5 or AAVS1 normalized to the amount of the PGK cassette into CCR5 (mean ± s.e.m., n = 3). (e) Fold change in expression of the indicated genes in sorted EGFP+ T cells upon targeting the indicated cassettes into AAVS1, relative to EGFP− sorted and mock-treated cells (mean ± s.e.m., n = 3). Statistical analysis as in Figure 2d. ***P < 0.001; ****P < 0.0001. Dashed line indicates the reference value in mock-treated cells.
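Panel d of Figure 5 reports relative MFI, normalized to the PGK cassette targeted into CCR5. The sketch below shows that kind of normalization. The function name `relative_mfi`, the dictionary layout and every number in it are hypothetical and are not values from the paper.

```python
def relative_mfi(mfi_by_condition, reference):
    """Divide every MFI by the MFI of the chosen reference condition."""
    ref = mfi_by_condition[reference]
    return {condition: value / ref for condition, value in mfi_by_condition.items()}

# Made-up MFI values in arbitrary units, keyed by (cassette, locus).
mfi = {
    ("PGK", "CCR5"): 500.0,
    ("PGK", "AAVS1"): 1000.0,
    ("EF1A_intron", "AAVS1"): 4000.0,
}
rel = relative_mfi(mfi, reference=("PGK", "CCR5"))
for condition, value in rel.items():
    print(condition, value)
```

Normalizing within each experiment in this way lets cassettes and loci be compared across blood donors even when absolute fluorescence differs between runs.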

We observed no significant changes in PPP1R12C expression when the SFFV-driven or EF1A-driven cassettes were targeted into one or both alleles. As all cassettes had the identical poly(A) sequence, the promoter-specific reduction in PPP1R12C expression cannot be attributed solely to transcriptional termination induced by insertion of this exogenous site. Rather, the presence of deliberately introduced (EF1A_intron) or cryptic (PGK) splice acceptor sites in the expression cassettes generated aberrant PPP1R12C transcripts containing exon 1 of the gene spliced to the acceptor sites upstream of EGFP and terminating at the exogenous poly(A) site (Fig. 3b,c and Supplementary Fig. 6). As expected, a cassette lacking the poly(A) site did not downregulate PPP1R12C expression, whereas a construct lacking a promoter but containing a splice acceptor site downregulated PPP1R12C expression (Supplementary Fig. 7). To counteract the detrimental effects on expression of the targeted gene, we eliminated the splice acceptor site from the PGK promoter (PGK*), targeted the modified cassette into AAVS1 and found unperturbed PPP1R12C expression in the EGFP+ sorted cells (n = 3 sorted cell pools; Fig. 3d). We also inserted the EF1A_intron expression cassette in the reverse orientation (r-EF1A_intron) with respect to PPP1R12C transcription, and targeted it into the residual allele of cell clones with a knockout insertion in the other allele (Fig. 3e). Insertion in reverse orientation did not perturb PPP1R12C expression while maintaining strong transgene expression (n = 3 sorted cell pools; Fig. 3f,g). These data indicate that robust transgene expression can be obtained from within a cellular gene, in both divergent and convergent orientation, without interference with endogenous transcription, provided that the expression cassette is optimally designed.

Epigenetic features at the target loci

To assess the epigenetic response to insertion and correlate it with transcriptional output, we performed chromatin immunoprecipitation (ChIP) analysis on the targeted region of AAVS1 and CCR5 in untreated lymphoblastoid cells and in clones carrying biallelic insertion of the different types of cassette. We mapped RNA polymerase II (PolII), histone H3 and its post-translational modifications associated with open chromatin and transcription (H3K4me2; H3K9ac), productive PolII elongation (H3K36me3) and repression (H3K27me3 and H3K9me3)26. CCR5 was expressed at a very low level (Fig. 2a) and showed repressive histone marks and scarce PolII association (Fig. 4a). Insertion of the SFFV cassette resulted in reduced H3K27me3 repressive marks throughout the gene body and a substantial increase in PolII and markers of active transcription (H3K36me3 around the integration site and H3K4me2 at the transcription start site (TSS); Fig. 4b). PolII and markers of actively transcribed chromatin were strongly enriched on the exogenous cassette, albeit with a concomitant enrichment in the H3K9me3 repressive mark (Fig. 4b). In contrast to CCR5, AAVS1 was expressed at a much higher level (Fig. 2d) and showed typical marks of open and transcribed chromatin, PolII enrichment and a nucleosome-free region around the TSS (Fig. 4c). Insertion of the transgene cassettes enhanced PolII and the preexisting H3K36me3 elongation marks at both sides of


the integrations (Fig. 4d–f and Supplementary Fig. 8). In gene-targeted versus untreated cells, we observed more nascent transcripts in both sense and antisense orientation between the TSS of PPP1R12C and the transgene insertion site (Supplementary Fig. 9). PolII and histone marks of active transcription were strongly enriched in all the cassettes, whereas marks of repression were underrepresented (Fig. 4d–f and Supplementary Fig. 8). Only for the EF1A_intron cassette, which we showed forces termination of PPP1R12C transcription, were repressive marks placed downstream of the insertion site (Fig. 4f). In summary, all insertions remodeled the epigenetic features of the target locus, although in a different manner and with distinct transcriptional outcomes at the two loci.

Sustainable gene transfer in T cells

To achieve targeted integration in primary T lymphocytes, we used an adenoviral vector (Ad5/35) to deliver ZFNs and IDLVs as donor DNA. We achieved up to 5% EGFP+ cells across multiple independent experiments targeting either CCR5 or AAVS1 (22 gene targeting experiments from 11 normal blood donors, Fig. 5a), without selection for the desired events. Gene-targeted lymphocytes had a normal growth rate (Fig. 5b), and could be expanded and sorted to 95.4% ± 0.7% (mean ± s.e.m., 12 cell sortings from independent experiments) EGFP+ cells. Molecular analysis confirmed targeted integration in up to 83% of the target alleles in the EGFP+ cells and in 12 of 13 analyzed clones (Fig. 5c and Supplementary Fig. 10). In agreement with our observations in lymphoblastoid cells, EGFP

Figure 6 | Transgene expression in human stem cells and their progeny after targeted integration into AAVS1. (a) Percentage of EGFP+ NSCs at the indicated time after transduction with PGK-EGFP-pA donor IDLV (with or without AAVS1 ZFNs), from a representative experiment of three. (b) Immunofluorescence images of neurospheres derived from the donor and ZFNs treated NSCs in a. Scale bar, 100 µm. (c) Representative confocal images of in vitro differentiated NSCs from b showing EGFP expression in neurons (TUJ1-immunopositive cells, left) or astrocytes (GFAP immunopositive cells, right). Merged signals are yellow; nuclei were counterstained with TO-PRO-3. Scale bar, 50 µm. Percentage of EGFP+ neurons or astrocytes upon in vitro differentiation of the indicated NSCs (right; mean ± s.e.m., n = 3). (d) Percentage of EGFP+ human iPSCs at the indicated time after transduction with EF1A_intron-EGFP-pA donor IDLV (with or without AAVS1-ZFNs; mean ± s.e.m., n = 3). (e) Representative immunofluorescence images of EGFP+ (green) iPSC clones showing expression of pluripotency markers (red). Scale bar, 120 µm. (f) PCR analyses for the 5′ and 3′ junctions generated by targeted integration of the EGFP cassette into AAVS1 of EGFP+ iPSC clones. NTC, no template control. +, positive control for the PCR from a cell clone containing targeted integration of the EF1A_intron cassette. M, marker. (g) Phase contrast (top) and fluorescence microscopy (bottom) of 6-d-old embryoid bodies derived from the GT1 iPSC clone. (h) Representative immunofluorescence images of in vitro–differentiated GT1 iPSCs showing expression of lineage-specific markers (red). Nestin and Pax6 are expressed by neural rosettes; Sox17 or SMA are expressed by endoderm- or mesoderm-derived cells, respectively. DNA was stained with DAPI. Scale bar, 120 µm.

expression from a given promoter was higher when expressed from AAVS1 than CCR5 (Fig. 5d). Robust expression at AAVS1 was reproduced in primary T cells upon targeting another transgene, CFP, and we observed stable transgene expression in T cells upon long-term culture (30 d) and switch to resting conditions (Supplementary Fig. 11). We then assessed endogenous gene expression at and around AAVS1 in the EGFP+ lymphocytes (n = 3 sorted cell pools per promoter; Fig. 5e) and found no significant upregulation of any flanking gene. Whereas expression of the targeted PPP1R12C was significantly downregulated by insertion of the PGK (P < 0.001) and EF1A_intron (P < 0.0001) cassettes, it was unaffected by insertion of the mutated PGK* promoter or the r-EF1A_intron cassette. These data show that it was feasible to rapidly generate primary T cells robustly expressing a transgene from a preselected site in the genome without inducing any detectable alteration of endogenous transcription.

Targeted integration in neural stem cells and iPSCs

We assessed the stability and robustness of transgene expression from our optimally designed expression cassettes in AAVS1 of human neural stem cells (NSCs) and iPSCs. In NSCs, we observed 6% EGFP+ cells without selection, which could be stably maintained upon serial passaging in bulk neurosphere cultures for up to 3 months (Fig. 6a,b). EGFP+ NSCs differentiated into neuronal or astroglial cells in a ratio similar to that of untreated cells (Fig. 6c). Targeted integration into AAVS1 in iPSCs reached ~1% efficiency


without selection. We stably maintained EGFP+ iPSCs in bulk culture for up to 3 months (Fig. 6d) and could expand them as single cell–derived clones. These clones homogeneously expressed EGFP and pluripotency markers (Fig. 6e), and we analyzed them to confirm site-specific integration into AAVS1 (Fig. 6f). We observed uniformly EGFP+ embryoid bodies from these clones and stable transgene expression upon differentiation into the three germ layers (Fig. 6g,h). These data indicate that ZFN-mediated targeted integration into AAVS1 was well tolerated by human stem cells and allowed stable and robust transgene expression in these cells and their differentiated progeny.

DISCUSSION

The target locus had a large impact on transgene expression, with AAVS1 consistently supporting higher transgene expression than CCR5 in several human cell types and when using multiple promoters. Whereas CCR5 showed repressive chromatin marks and was largely restricted in its expression to some blood cell lineages, PPP1R12C had an open and active chromatin and was constitutively transcribed in most cell types. Transgene insertion at CCR5 resulted in substantial perturbation of preexisting negative regulation of transcription. The targeted and flanking genes were upregulated, and a substantial fraction of the preexisting repressive chromatin marks were removed around the integration site. Upregulation of flanking gene expression was dependent on the type but not on the strength of the exogenous promoter. Expression of the transgene, however, was lower relative to when we targeted the same cassette into AAVS1, suggesting an antagonism between the repressive features of the locus and the expression proficiency of the cassette.
This contention is supported by the co-enrichment of active and repressive chromatin marks when we targeted the cassette into CCR5, and the identification of a cis-acting region at the 3′ end of the CCR5 insertion site that downregulates transgene expression when targeted into AAVS1. In primary T lymphocytes, in which CCR5 expression is greater than PPP1R12C expression (data not shown), transgene expression remained constrained at CCR5, indicating that some features of a tightly regulated gene may adversely impact transgene expression even upon activation. Transgene insertion at AAVS1 did not upregulate the targeted and flanking genes, even when we tested strong ubiquitously expressed or tissue-specific promoters. The resistance of this locus to perturbation may be due to the presence of chromatin insulators24,25 that prevent enhancer-stimulated gene expression. Upon targeting the putative insulator of AAVS1 into CCR5, upregulation induced by an exogenous promoter in the latter locus was reduced. The enhanced chromatin activation and elongation marks observed in PPP1R12C without upregulated expression may be explained by bidirectional transcription originating from the exogenous promoters (as shown by nascent transcript analysis and increased PolII association) that collides with and slows down PPP1R12C transcription. Insertion into AAVS1 is thus preferable not only because this locus is ubiquitously expressed and permissive to robust transgene expression but also because local chromatin elements may shield the endogenous genes near the insertion site from transcriptional interference. We obtained robust transgene expression from an intronic insertion without interfering with expression of the targeted gene. This outcome requires eliminating exogenous splice sites

that cross-talk with overlapping endogenous transcription and cause aberrant splicing, premature termination and deposition of repressive chromatin marks downstream of the exogenous poly(A) site. This rule may be applicable to randomly integrating vectors27 as well; it is conceivable that the risk of insertional mutagenesis may be reduced by avoiding splice sites in their sequence. One may predict that transgene insertion that does not transcriptionally perturb the target locus should have no detrimental consequences in the transduced cells even if the function of the locus has not been fully investigated. Such data, together with the evidence for robust and stable transgene expression, may justify calling AAVS1 a suitable site for ‘safe’ transgene insertion in the cell types assayed. A recent study reported criteria to identify safe integrations in human iPSCs, based on selecting clones carrying vector insertions sufficiently distant from any known gene and ultraconserved region28. Whereas this study, like ours, recognizes the need to rule out transcriptional perturbation of endogenous genes to claim a safe insertion, our approach can be used to reproducibly establish transgene activity at a preselected genomic site and in bulk-treated cells. Whereas intergenic sites may appear preferable for gene targeting, we note that they may result in poor and unstable transgene expression, especially when targeting a gene desert or a heterochromatic region. Moreover, limited knowledge on the function of most intergenic regions and the pervasive transcription of the whole genome29 caution against assuming no underlying function at an intergenic site. In contrast, intronic regions of widely expressed genes may represent more suitable targets for transgene insertion. As we have shown, one can examine whether intronic insertion perturbs expression of the locus and/or whether knockout of the targeted gene shows a phenotype in the relevant cell types. 
In gene-function studies, sustainable gene transfer can ensure robust and predictable expression of an exogenous sequence, whether encoding a protein, interfering RNA or noncoding RNA, while limiting confounding factors of random integration. In studies of genome organization, we demonstrate the feasibility of rapidly generating cells carrying hemi- and homozygous insertion of cis-acting elements at preselected loci, allowing analysis of transcriptional and epigenetic effects of the element at its natural or ectopic location in the genome of different cell types, including primary cells. Sustainable gene transfer in stem cells makes possible neutral cell marking and reliable tracking of cell progeny in vivo and will increase the safety of cell-based therapies. We report site-specific gene delivery in primary T lymphocytes by a method that can be easily adapted to current clinical protocols of adoptive immunotherapy and used for the treatment of infections, cancer and autoimmune disorders. Additional studies conducted according to the principles and techniques reported here should identify other genomic acceptor sites suitable for sustainable gene transfer in model organisms.

Methods

Methods and any associated references are available in the online version of the paper at http://www.nature.com/naturemethods/.

Note: Supplementary information is available on the Nature Methods website.

Acknowledgments

We thank B. Celona, A. Anselmo and F. Ungaro for help with some experiments, A. Agresti, M. Bianchi and D. Gabellini for critical discussion, C. Di Serio and A. Nonis for statistical counseling, K. Ponder (Washington University, St. Louis),

C. Miao (University of Washington, Seattle) and A. Recchia (University of Modena and Reggio Emilia) for providing reagents. Research was supported by Telethon (Telethon Institute for Gene Therapy grant), 7th EU Framework Programme (grant agreement 222878, PERSIST), European Research Council Advanced grant (249845 Targeting Gene Therapy), Fondazione Cariplo (Nobel Project) to L.N., Italian Ministry of Health (Giovani Ricercatori) to V.B.; Telethon (TGT06B02) to A.G.; Italian Ministry of Research and University (Ideas), Italian Ministry of Health (Giovani Ricercatori), Associazione Italiana per la Ricerca sul Cancro and Fondazione Cariplo to C.B.

AUTHOR CONTRIBUTIONS

A.L., D.C., P.G., B.D.S., E.P. and D.F.C. designed and performed experiments, and interpreted data. M.N., Z.M., A.C., P.L.R., M.D. and O.M.P. performed experiments and interpreted data. M.C.H. and P.D.G. provided ZFNs. A.G., V.B. and C.B. coordinated NSC, iPSC and T-cell work, respectively. A.L. and L.N. conceived the project, coordinated all work and wrote the paper.


COMPETING FINANCIAL INTERESTS

The authors declare competing financial interests: details accompany the full-text HTML version of the paper at http://www.nature.com/naturemethods/.

Published online at http://www.nature.com/naturemethods/. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Ellis, J. Silencing and variegation of gammaretrovirus and lentivirus vectors. Hum. Gene Ther. 16, 1241–1246 (2005).
2. Naldini, L. Ex vivo gene transfer and correction for cell-based therapies. Nat. Rev. Genet. 12, 301–315 (2011).
3. Lombardo, A. et al. Gene editing in human stem cells using zinc finger nucleases and integrase-defective lentiviral vector delivery. Nat. Biotechnol. 25, 1298–1306 (2007).
4. Zou, J. et al. Gene targeting of a disease-related gene in human induced pluripotent stem and embryonic stem cells. Cell Stem Cell 5, 97–110 (2009).
5. Maeder, M.L. et al. Rapid “open-source” engineering of customized zinc-finger nucleases for highly efficient gene modification. Mol. Cell 31, 294–301 (2008).
6. Porteus, M.H. & Baltimore, D. Chimeric nucleases stimulate gene targeting in human cells. Science 300, 763 (2003).
7. Carroll, D. Progress and prospects: zinc-finger nucleases as gene therapy agents. Gene Ther. 15, 1463–1468 (2008).
8. Urnov, F.D., Rebar, E.J., Holmes, M.C., Zhang, H.S. & Gregory, P.D. Genome editing with engineered zinc finger nucleases. Nat. Rev. Genet. 11, 636–646 (2010).
9. Gondo, Y., Fukumura, R., Murata, T. & Makino, S. Next-generation gene targeting in the mouse for functional genomics. BMB Rep. 42, 315–323 (2009).
10. Frazer, K.A., Murray, S.S., Schork, N.J. & Topol, E.J. Human genetic variation and its contribution to complex traits. Nat. Rev. Genet. 10, 241–251 (2009).

11. de Boer, B.A., Ruijter, J.M., Voorbraak, F.P. & Moorman, A.F. More than a decade of developmental gene expression atlases: where are we now? Nucleic Acids Res. 37, 7349–7359 (2009).
12. Cartier, N. et al. Hematopoietic stem cell gene therapy with a lentiviral vector in X-linked adrenoleukodystrophy. Science 326, 818–823 (2009).
13. Aiuti, A. et al. Gene therapy for immunodeficiency due to adenosine deaminase deficiency. N. Engl. J. Med. 360, 447–458 (2009).
14. Samulski, R.J. et al. Targeted integration of adeno-associated virus (AAV) into human chromosome 19. EMBO J. 10, 3941–3950 (1991).
15. Liu, R. et al. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell 86, 367–377 (1996).
16. Smith, J.R. et al. Robust, persistent transgene expression in human embryonic stem cells is achieved with AAVS1-targeted integration. Stem Cells 26, 496–504 (2008).
17. Henckaerts, E. et al. Site-specific integration of adeno-associated virus involves partial duplication of the target locus. Proc. Natl. Acad. Sci. USA 106, 7571–7576 (2009).
18. Hockemeyer, D. et al. Efficient targeting of expressed and silent genes in human ESCs and iPSCs using zinc-finger nucleases. Nat. Biotechnol. 27, 851–857 (2009).
19. Zou, J. et al. Oxidase deficient neutrophils from X-linked chronic granulomatous disease iPS cells: functional correction by zinc finger nuclease mediated safe harbor targeting. Blood 117, 5561–5572 (2011).
20. Matrai, J. et al. Hepatocyte-targeted expression by integrase-defective lentiviral vectors induces antigen-specific tolerance in mice with low genotoxic risk. Hepatology 53, 1696–1707 (2011).
21. Hafenrichter, D.G. et al. Quantitative evaluation of liver-specific promoters from retroviral vectors after in vivo transduction of hepatocytes. Blood 84, 3394–3404 (1994).
22. Okuyama, T. et al. Liver-directed gene therapy: a retroviral vector with a complete LTR and the ApoE enhancer-alpha 1-antitrypsin promoter dramatically increases expression of human alpha 1-antitrypsin in vivo. Hum. Gene Ther. 7, 637–645 (1996).
23. Miao, C.H. et al. Inclusion of the hepatic locus control region, an intron, and untranslated region increases and stabilizes hepatic factor IX gene expression in vivo but not in vitro. Mol. Ther. 1, 522–532 (2000).
24. Ogata, T., Kozuka, T. & Kanda, T. Identification of an insulator in AAVS1, a preferred region for integration of adeno-associated virus DNA. J. Virol. 77, 9000–9007 (2003).
25. Li, C. et al. A small regulatory element from chromosome 19 enhances liver-specific gene expression. Gene Ther. 16, 43–51 (2009).
26. Zhou, V.W., Goren, A. & Bernstein, B.E. Charting histone modifications and the functional organization of mammalian genomes. Nat. Rev. Genet. 12, 7–18 (2011).
27. Cavazzana-Calvo, M. et al. Transfusion independence and HMGA2 activation after gene therapy of human beta-thalassaemia. Nature 467, 318–322 (2010).
28. Papapetrou, E.P. et al. Genomic safe harbors permit high beta-globin transgene expression in thalassemia induced pluripotent stem cells. Nat. Biotechnol. 29, 73–78 (2011).
29. Jacquier, A. The complex eukaryotic transcriptome: unexpected pervasive transcription and novel small RNAs. Nat. Rev. Genet. 10, 833–844 (2009).


ONLINE METHODS

Vectors and zinc finger nucleases. Homology-directed repair donor and ZFN-expressing constructs were generated from HIV-derived, third-generation self-inactivating transfer constructs; sequences and maps of the relevant part of the constructs are available in Supplementary Table 1 and Supplementary Figure 12, respectively. Cloning information is available upon request. IDLV stocks were prepared as described previously3 and titered by HIV-1 Gag p24 immunocapture assay (PerkinElmer). Retroviral vectors were produced with the pMXs constructs (Addgene) containing the human cDNAs for OCT4, SOX2, c-MYC and KLF4 as described previously30. ZFNs targeting exon 3 of the CCR5 gene or intron 1 of the PPP1R12C gene have been previously described3,18 and we transiently expressed them from IDLVs using the EF1A_intron or the SFFV promoter or from an Ad5/35 adenoviral vector using the CMV promoter. These two pairs of ZFNs contain the obligate heterodimer FokI domain31.

Cell culture, transduction and differentiation. Human B-lymphoblastoid cells, HEK293T, HepG2, K-562 and mouse embryonic fibroblasts were maintained as described previously3. For transduction, 1 × 10⁶ cells were incubated overnight with the indicated donor IDLVs (750 ng HIV Gag p24 (p24) equivalent ml−1 for B-lymphoblastoid cells and K-562, and 200 ng p24 ml−1 for HepG2 cells), either alone or together with cognate IDLVs expressing either CCR5 or AAVS1 ZFNs (1 µg p24 ml−1 for each ZFN-expressing IDLV for B-lymphoblastoid cells and K-562, and 200 ng p24 ml−1 for each ZFN-IDLV for HepG2 cells), and then expanded to perform flow cytometry and molecular analyses (FACSCalibur or FACSCanto II; Becton Dickinson Pharmingen). Single cell–derived clones were obtained by limiting dilution. T lymphocytes were isolated from healthy donors’ peripheral blood mononuclear cells and activated with anti-CD3/CD28–conjugated beads (Invitrogen) as described previously32.
Forty-eight hours after activation, 2.5 × 10⁶ cells ml−1 were co-infected overnight with the ZFN-expressing Ad5/35 (multiplicity of infection: 1,000) and the indicated donor IDLV (1 µg p24 ml−1 each) and then expanded33 to perform flow cytometry and molecular analyses. EGFP+ and EGFP− cells were sorted by MoFlo XDP Cell Sorter (Beckman Coulter) and stimulated for expansion as described previously34. Four weeks later, DNA and RNA were extracted to perform molecular analyses. NSCs were derived from the telencephalon and diencephalon of a 10.5-week post-conception human fetus and cultured as neurospheres, as described previously35. For transduction, mechanically dissociated neurospheres were plated on Matrigel-coated plates (2.5 × 10⁴ cells cm−2) as single cells and incubated overnight with the indicated donor IDLV (37.5 ng p24 ml−1) either alone or together with two AAVS1 ZFN-expressing IDLVs (12.5 ng p24 ml−1 each). Twenty-four hours later, adherent cells were enzymatically detached and replated in suspension as single cells for neurosphere formation. Treated cells were expanded as bulk cultures for up to seven subculturing passages (90 d). Proliferation and self-renewal were assessed by establishing growth curves as described previously36,37. For multipotency assays, gene-targeted and untreated neurospheres were mechanically dissociated and plated on Matrigel-coated glass coverslips (2.5 × 10⁴ cells cm−2) in serum-free medium containing human recombinant basic fibroblast growth factor (hFGF)-2 (10 ng ml−1; Peprotech). After 5 d, cells were exposed to medium containing 2% FBS and

human leukemia inhibitory factor (10 ng ml−1; Peprotech) and grown for an additional 10 d. Neuronal and glial differentiation was assessed by immunofluorescence using antibodies to lineage-specific markers (Supplementary Table 2). Human iPSCs were derived from human fetal lung fibroblasts (IMR-90; American Type Culture Collection; ATCC) as described previously30. Briefly, 10⁵ human fibroblasts were transduced twice, with a 24-h interval, with retroviral vectors expressing OCT4, SOX2, c-MYC and KLF4. Cells were then plated on mitomycin C (Sigma) inactivated mouse embryonic fibroblasts and cultured in human embryonic stem (ES) medium for the next 25–30 d, when ESC-like colonies were picked for expansion and characterized as bona fide iPSCs as described previously (Supplementary Fig. 13 and data not shown)30. For single-cell passaging and transduction, iPSCs were treated with the ROCK inhibitor Y27632 (ref. 38) (Sigma). For transduction, single cell–derived iPSCs were incubated overnight with the EF1A_intron-EGFP-pA donor IDLV (300 ng p24 ml−1) either alone or together with two AAVS1 ZFN-expressing IDLVs (400 ng p24 ml−1 each). Treated cells were expanded as bulk cultures for up to 22 subculturing passages (90 d) and analyzed by flow cytometry for EGFP expression at the indicated time points. Single cell–derived EGFP+ clones were obtained from bulk-treated cells by single-cell passaging and analyzed for expression of pluripotency markers by immunofluorescence. For embryoid body (EB) formation, iPSCs were collected by treatment with 1 mg ml−1 collagenase IV for 1 h. The cell clumps were then cultured in a T25 flask for 6 d in human ES medium without hFGF-2. For mesoderm and endoderm differentiation, EBs were plated onto gelatin-coated tissue culture dishes at low density in DMEM (Invitrogen) medium with 20% FBS. Medium was replaced every 2 d. For neural differentiation, 6-d-old EBs were plated onto Matrigel-coated dishes in ITSF medium39.
Generation of SMA-, SOX17-, PAX6- and Nestin-immunopositive cells was detected by immunofluorescence (Supplementary Table 2). Pictures were taken using a Nikon Eclipse 3000 fluorescence microscope. The use of human fetal tissue for the establishment and use of hNSC lines, and the use of human fibroblasts and T lymphocytes were approved by the San Raffaele Scientific Institute Ethical Committee (protocols TIGET-HPCT and TIGET-PERIBLOOD).

Gene-targeting analyses. For Southern blot analyses, genomic DNA extracted with the Blood and Cell Culture DNA Midi kit (Qiagen) was digested with the indicated enzymes. Matched DNA amounts were separated on 0.8% agarose gel, transferred to a nylon membrane and probed with the indicated 32P-radiolabeled sequences. Membranes were exposed in a Storage Phosphor Screen, and Typhoon 9410 densitometric scanning (Amersham Biosciences) was used to quantify band intensity using the ImageQuant TL software. Gene targeting efficiencies were calculated as follows: (intensity of the modified band / (intensity of the wild-type + modified band)) × 100. For qPCR analysis, 200 ng of genomic DNA was analyzed as described previously3. To detect targeted integration in CCR5 or AAVS1, 40–200 ng of genomic DNA was analyzed by PCR using primers indicated in Supplementary Table 3. PCR amplicons were resolved on agarose gel and visualized by ethidium bromide staining.

Gene expression analyses. For gene expression analysis, total RNA extracted from 4 × 10⁶ cells using the RNeasy Mini kit (Qiagen)

doi:10.1038/nmeth.1674


was reverse-transcribed using random hexamers according to the manufacturer’s protocol for the SuperScript III First-Strand Synthesis System (Invitrogen). We analyzed 50–400 ng of cDNA from T lymphocytes or B lymphoblastoid cells, respectively, in triplicate with TaqMan Gene Expression assays (Applied Biosystems; Supplementary Table 4) in a 7900HT real-time PCR thermal cycler. The SDS 2.2.1 software was used to extract raw data (Ct and raw fluorescence). Genes with a Ct value ≥37 were excluded from the analyses. The relative expression level of each gene was calculated by the ∆∆Ct method40, normalized to HPRT and B2M expression (housekeeping gene controls), and represented as fold change relative to the mock-treated samples (calibrator). Real-time PCR Miner software (http://www.miner.ewindup.info/)41 was used to calculate the mean PCR amplification efficiency for each gene, whereas the qBase software program (http://www.biogazelle.com/) was used to measure the relative expression for each gene42. The CCR5 PCR assay amplified the splice junction between exons 2 and 3 upstream of the integration site. The PPP1R12C PCR assay amplified the splice junction between exons 3 and 4 downstream of the integration site. To detect aberrant PPP1R12C fusion transcripts, 200 ng of cDNA from the indicated samples was amplified by PCR using primers specific for exon 1 of the gene and EGFP. PCR amplicons were resolved on agarose gel and visualized by ethidium bromide staining.

Chromatin immunoprecipitation and nascent transcripts analyses. ChIP was performed as previously described43. Briefly, 2.5 × 10⁶ cells or 5 × 10⁶ cells were fixed with 1% formaldehyde in culture medium for 10 min at 37 °C, formaldehyde was quenched with 125 mM glycine for 2 min at room temperature (25–26 °C), and cells were collected by centrifugation and resuspended in membrane lysis buffer for 10 min on ice.
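Two of the calculations described above, the Southern blot targeting efficiency and the ∆∆Ct relative expression, can be illustrated with a minimal Python sketch. It is not the software used in the study (ImageQuant TL, Real-time PCR Miner and qBase). The function names, the Ct values and the default amplification efficiency of 2.0 are hypothetical, and averaging the HPRT and B2M Ct values is an assumption about how the two reference genes could be combined.

```python
from statistics import mean

CT_CUTOFF = 37.0  # from the text: genes with a Ct value >= 37 were excluded


def targeting_efficiency(modified_band, wild_type_band):
    """Percent of alleles with targeted integration, from band intensities."""
    total = wild_type_band + modified_band
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * modified_band / total


def delta_ct(target_ct, reference_cts):
    """Target Ct minus the mean reference Ct, or None if the target is excluded."""
    if target_ct >= CT_CUTOFF:
        return None
    # Assumption: the reference genes (e.g. HPRT, B2M) are combined by
    # averaging their Ct values.
    return target_ct - mean(reference_cts)


def fold_change(sample, calibrator, efficiency=2.0):
    """Expression in `sample` relative to `calibrator` (mock-treated cells).

    `efficiency` is the per-cycle amplification factor; 2.0 assumes perfect
    doubling, which is an illustrative default and not a measured value.
    """
    d_sample = delta_ct(sample["target"], sample["refs"])
    d_calibrator = delta_ct(calibrator["target"], calibrator["refs"])
    if d_sample is None or d_calibrator is None:
        return None
    return efficiency ** -(d_sample - d_calibrator)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    mock = {"target": 25.0, "refs": [20.0, 22.0]}
    targeted = {"target": 26.0, "refs": [20.0, 22.0]}
    print(targeting_efficiency(modified_band=1.0, wild_type_band=1.0))
    print(fold_change(targeted, mock))
```

In this sketch, a target gene that crosses threshold one cycle later than in the calibrator, with unchanged reference genes, yields a fold change below 1, that is, downregulation relative to mock-treated cells.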
Nuclei were then collected by centrifugation, resuspended in nuclear lysis buffer and sonicated at high power for 20 × 30 s pulses (30 s pause between pulses) in a Bioruptor UCD-200 (Diagenode) to generate average DNA fragments of ~300 bp. The sonicated lysate was clarified by centrifugation and either stored for subsequent procedures (input) or incubated overnight on a rotating platform at 4 °C with 50 µl of protein G magnetic beads (Dynabeads G; Invitrogen) that were pre-coated or not with 5–10 µg of ChIP-grade antibodies (Abcam) to human histone H3 or its indicated post-translational modifications (Supplementary Table 5). IgG isotypes were also used as controls. Immunoprecipitated chromatin eluted from the beads and input material were reverse cross-linked overnight at 65 °C, and DNA was purified using the QIAquick PCR Purification kit (Qiagen) and then eluted in 70 µl TE (pH 7.5). For real-time PCR analyses, 1–2 µl of DNA was amplified in duplicate using the Power SYBR Green PCR reaction mix (Applied Biosystems) and a primer set specific for each of the investigated genomic sites (Supplementary Table 6) in a 7900HT real-time PCR thermal cycler. The SDS 2.2.1 software was used to extract raw data (Ct and raw fluorescence). The percentage of enrichment of each histone modification for each investigated site was calculated by the ∆Ct method using the input as normalizer. Chromatin-associated nascent transcripts were isolated as previously described44,45. Briefly, 5 × 10⁶ cells were resuspended in 400 µl of 1.5 mM HEPES buffer (pH 7.9) with 0.3 M sucrose and lysed by the addition of 0.8% NP-40 in 400 µl HEPES buffer with 0.3 M sucrose. Nuclei were collected through a 3 ml cushion of HEPES buffer with 0.9 M sucrose by a 30-min centrifugation at 2,500g and resuspended in 100 µl of nuclear suspension buffer.
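The ∆Ct-based enrichment calculation described above for ChIP, with the input as normalizer, can be sketched as follows. This is a minimal illustration and not the authors' analysis. The function name `percent_input`, the `input_fraction` parameter (the share of chromatin set aside as input, which the text does not state) and the default efficiency of 2.0 are all assumptions.

```python
from statistics import mean


def percent_input(ip_cts, input_cts, input_fraction=1.0, efficiency=2.0):
    """Enrichment of a ChIP sample as a percentage of input, by the delta-Ct method.

    ip_cts / input_cts: replicate Ct values (the text reports duplicates).
    input_fraction: hypothetical share of the chromatin kept as input
        (1.0 means the input Ct already represents 100% of the material).
    efficiency: per-cycle amplification factor; 2.0 assumes perfect doubling.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Fewer cycles in the IP than in the input means more recovered DNA.
    delta = mean(input_cts) - mean(ip_cts)
    return 100.0 * input_fraction * efficiency ** delta


if __name__ == "__main__":
    # Hypothetical duplicate Ct values for one antibody at one genomic site.
    print(percent_input(ip_cts=[22.1, 21.9], input_cts=[25.0, 25.0],
                        input_fraction=0.01))
```

Expressing enrichment relative to input, as the text does, makes values comparable between genomic sites that differ in primer performance or in the amount of starting chromatin.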

Nuclei were then lysed on ice for 10 min by adding 750 µl of nuclear lysis buffer. Chromatin was collected by a 10-min centrifugation at 12,300g at 4 °C. Nascent RNAs were extracted using the RNeasy Plus Mini kit (Qiagen), and transcripts were reverse transcribed using either random hexamer primers or gene-specific primers (Supplementary Table 6) and SuperScript III (Invitrogen). Real-time PCRs were performed with a ViiA 7 Real-Time PCR System (Applied Biosystems). Serial dilutions of genomic DNA were used as a standard curve to quantify reverse-transcribed RNA. PCR amplifications were performed in triplicate.

Statistical analysis. One-way ANOVA with Bonferroni's multiple-comparison post-test was used to assess the statistical significance of differences in gene expression among all samples (P < 0.05). In post-hoc analyses evaluating differences among all samples, a very high expression in one sample (that is, SFFV promoter targeted into the CCR5 gene) may hinder the detection of samples showing a much lower level of overexpression (that is, PGK promoter targeted into the CCR5 gene). Thus, to detect all the overexpressed samples, we analyzed each sample individually by performing an ANOVA among mock, EGFP− and that EGFP+ sample. Based on the variance in B2M and hypoxanthine guanine phosphoribosyl transferase (HPRT) expression among all samples in the analysis (n = 200), fold changes in the test gene between 0.6 and 1.4 were not considered relevant.

30. Takahashi, K. et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 131, 861–872 (2007).
31. Miller, J.C. et al. An improved zinc-finger nuclease architecture for highly specific genome editing. Nat. Biotechnol. 25, 778–785 (2007).
32. Bondanza, A. et al. Suicide gene therapy of graft-versus-host disease induced by central memory human T lymphocytes. Blood 107, 1828–1836 (2006).
33. Kaneko, S. et al. IL-7 and IL-15 allow the generation of suicide gene-modified alloreactive self-renewing central memory human T lymphocytes. Blood 113, 1006–1015 (2009).
34. Riddell, S.R. & Greenberg, P.D. The use of anti-CD3 and anti-CD28 monoclonal antibodies to clone and expand human antigen-specific T cells. J. Immunol. Methods 128, 189–201 (1990).
35. Vescovi, A.L. et al. Isolation and cloning of multipotential stem cells from the embryonic human CNS and establishment of transplantable human neural stem cell lines by epigenetic stimulation. Exp. Neurol. 156, 71–83 (1999).
36. Vescovi, A.L. & Snyder, E.Y. Establishment and properties of neural stem cell clones: plasticity in vitro and in vivo. Brain Pathol. 9, 569–598 (1999).
37. Neri, M. et al. Efficient in vitro labeling of human neural precursor cells with superparamagnetic iron oxide particles: relevance for in vivo cell tracking. Stem Cells 26, 505–516 (2008).
38. Watanabe, K. et al. A ROCK inhibitor permits survival of dissociated human embryonic stem cells. Nat. Biotechnol. 25, 681–686 (2007).
39. Roy, N.S. et al. Functional engraftment of human ES cell-derived dopaminergic neurons enriched by coculture with telomerase-immortalized midbrain astrocytes. Nat. Med. 12, 1259–1268 (2006).
40. Pfaffl, M.W. A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Res. 29, e45 (2001).
41. Zhao, S. & Fernald, R.D. Comprehensive algorithm for quantitative real-time polymerase chain reaction. J. Comput. Biol. 12, 1047–1064 (2005).
42. Hellemans, J., Mortier, G., De Paepe, A., Speleman, F. & Vandesompele, J. qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data. Genome Biol. 8, R19 (2007).
43. Lee, T.I., Johnstone, S.E. & Young, R.A. Chromatin immunoprecipitation and microarray-based analysis of protein location. Nat. Protoc. 1, 729–748 (2006).
44. Wuarin, J. & Schibler, U. Physical isolation of nascent RNA chains transcribed by RNA polymerase II: evidence for cotranscriptional splicing. Mol. Cell Biol. 14, 7219–7225 (1994).
45. Masternak, K., Peyraud, N., Krawczyk, M., Barras, E. & Reith, W. Chromatin remodeling and extragenic transcription at the MHC class II locus control region. Nat. Immunol. 4, 132–137 (2003).
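
The ChIP real-time PCR analysis in the Methods calculates the percentage of enrichment by the ∆Ct method using the input as normalizer. A minimal Python sketch of one common form of that calculation is given here for illustration. The function name, the `input_fraction` parameter (the share of lysate set aside as input) and the assumption of perfect doubling per cycle (`efficiency=2.0`) are hypothetical and are not taken from the Methods.

```python
import math
from statistics import mean


def percent_input(ip_cts, input_cts, input_fraction=1.0, efficiency=2.0):
    """Percent-of-input enrichment from replicate Ct values (illustrative helper).

    input_fraction and efficiency are assumptions, not values from the Methods.
    """
    # Average the technical replicates (samples were amplified in duplicate).
    ip_ct = mean(ip_cts)
    input_ct = mean(input_cts)
    # Shift the input Ct to the value expected for 100% of the chromatin;
    # with input_fraction=1.0 the shift is zero.
    adjusted_input_ct = input_ct + math.log(input_fraction, efficiency)
    # Convert the delta-Ct between adjusted input and IP into a percentage.
    return 100.0 * efficiency ** (adjusted_input_ct - ip_ct)


# Made-up Ct values: the IP amplifies two cycles later than the input.
print(round(percent_input([27.0, 27.0], [25.0, 25.0]), 2))  # 25.0
```

A two-cycle delay of the immunoprecipitated sample relative to the input corresponds to one quarter of the input signal, hence 25%.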

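
For the nascent-transcript assay, the Methods state that serial dilutions of genomic DNA served as a standard curve to quantify reverse-transcribed RNA. The sketch below shows how such a curve is typically fitted and inverted, assuming a linear relation between Ct and log10 of the template quantity. The function names and the dilution values are invented for illustration and do not come from the paper.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(cts) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the fitted curve to estimate the template quantity for a Ct."""
    return 10 ** ((ct - intercept) / slope)


# Made-up tenfold dilution series of genomic DNA (copies per reaction).
standard_quantities = [1e5, 1e4, 1e3, 1e2]
standard_cts = [18.0, 21.3, 24.6, 27.9]

slope, intercept = fit_standard_curve(standard_quantities, standard_cts)
# Estimate the quantity in a hypothetical cDNA sample from its mean Ct.
sample_quantity = quantity_from_ct(23.0, slope, intercept)
```

Fitting on the log scale keeps the relation linear across the dilution range, so an unknown sample is quantified by reading its Ct back through the fitted line.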
nature methods