Transcriptional Regulation of Nanog by OCT4 and SOX2*

6 downloads 1083 Views 297KB Size Report
Mar 8, 2005 - tify a composite sox-oct cis-regulatory element essential for Nanog pluripotent transcription. This element is conserved over 250 million years of ...
THE JOURNAL OF BIOLOGICAL CHEMISTRY © 2005 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 280, No. 26, Issue of July 1, pp. 24731–24737, 2005 Printed in U.S.A.

Transcriptional Regulation of Nanog by OCT4 and SOX2* Received for publication, March 8, 2005, and in revised form, April 25, 2005 Published, JBC Papers in Press, April 27, 2005, DOI 10.1074/jbc.M502573200

David J. Rodda‡§, Joon-Lin Chew§¶!**, Leng-Hiong Lim‡§!, Yuin-Han Loh¶!, Bei Wang‡, Huck-Hui Ng ¶!‡‡, and Paul Robson‡!§§ From the ‡ Stem Cell and Developmental Biology and ¶Gene Regulation Laboratory, Genome Institute of Singapore, Singapore 138672 and the ! Department of Biological Sciences, National University of Singapore, Singapore 117543

Nanog, Sox2, and Oct4 are transcription factors all essential to maintaining the pluripotent embryonic stem cell phenotype. Through a cooperative interaction, Sox2 and Oct4 have previously been described to drive pluripotent-specific expression of a number of genes. We now extend the list of Sox2-Oct4 target genes to include Nanog. Within the Nanog proximal promoter, we identify a composite sox-oct cis-regulatory element essential for Nanog pluripotent transcription. This element is conserved over 250 million years of cumulative evolution within the eutherian mammals. A Nanog proximal promoter-EGFP (enhanced green fluorescent protein) reporter transgene recapitulates endogenous Nanog mRNA expression in embryonic stem cells and their differentiated derivatives. Sox2 and Oct4 interaction with the Nanog promoter was confirmed through mutagenesis and in vitro binding assays. Electrophoretic mobility shift assays indicate that the Sox2-Oct4 heterodimer forms more efficiently on the composite element within Nanog than the similar element within Fgf4. Using chromatin immunoprecipitation, we show that Oct4 and Sox2 bind to the Nanog promoter in living mouse and human embryonic stem cells. Furthermore, by specific knockdown of Oct4 and Sox2 mRNA by RNA interference in embryonic stem cells, we provide genetic evidence for a link between Oct4, Sox2, and the Nanog promoter. These studies extend the understanding of the pluripotent genetic regulatory network within which the Sox2-Oct4 complex are at the top of the regulatory hierarchy.

Nanog is a homeobox-containing transcription factor with an essential function in maintaining the pluripotent cells of the inner cell mass and in the derivation of embryonic stem cells (ESCs)1 from these (1). Furthermore, overexpression of Nanog * This work was supported in part by grants from A*Star. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. § These authors contributed equally to this work. ** Supported by the Singapore Millennium Foundation. ‡‡ To whom correspondence may be addressed: Genome Institute of Singapore, 60 Biopolis St., #02-01, Genome Bldg., Singapore 138672. Tel.: 65-6478-8145; Fax: 65-6478-9004; E-mail: [email protected]. edu.sg. §§ To whom correspondence may be addressed: Genome Institute of Singapore, 60 Biopolis St., #02-01, Genome Bldg., Singapore 138672. Tel.: 65-6478-8152; Fax: 65-6478-9005; E-mail: [email protected]. edu.sg. 1 The abbreviations used are: ESC, embryonic stem cell; EMSA, electrophoretic mobility shift assay; GFP, green fluorescent protein; EGFP, enhanced GFP; RA, retinoic acid; RNAi, RNA interference; LIF, leukemia inhibitory factor; FACS, fluorescence-activated cell sorting; ChIP, chromatin immunoprecipitation; ES, embryonic stem; PSBP, pluripotent cell-specific Sox element-binding protein. This paper is available on line at http://www.jbc.org

is capable of maintaining the pluripotency and self-renewing characteristics of ESCs under what normally would be differentiation-inducing culture conditions (2). Concomitant with this essential function in pluripotent cell maintenance is its restricted expression pattern. Nanog transcripts first appear in the inner cells of the morula prior to blastocyst formation (1, 2), are restricted to the inner cell mass in the blastocyst (3), and are no longer detectable at implantation. Expression of Nanog reappears in the proximal epiblast at embryonic day 6 and remains restricted to the epiblast as development progresses (4). The factors controlling expression of this gene have yet to be described. The POU domain-containing Oct4 and the HMG domaincontaining Sox2 are two other transcription factors known to be essential for normal pluripotent cell development and maintenance (5, 6). Although both have independent roles in determining other cell types (6, 7), at least part of their function in pluripotent cells is via a synergistic interaction between the two to drive transcription of target genes. Currently known targets of Sox2-Oct4 synergy are Fgf4, Utf1, and Fbx15, as well as Sox2 and Pou5f1 (the gene encoding Oct4) themselves (8 – 13). Each of these target genes has a composite element containing an octamer and a sox binding site. Our recent characterization of a genetic link between the Sox2-Oct4 complex and Sox2 and Pou5f1 expression, as well as their in vivo binding to these genes in mouse and human ESCs (12), suggests that this complex is at the top of the pluripotent cell genetic regulatory network. In this study, we were interested in identifying the cisregulatory module responsible for the pluripotent-specific expression of Nanog. Nanog represents the first known transcription factor appearing after compaction, and specific to the inner cells of the morula, both Oct4 and Sox2 are expressed prior to compaction in all blastomeres (6, 14). As the expression of Nanog precisely corresponds to the pluripotent phenotype, we reasoned that a molecular understanding of the regulation of this gene would provide further insight into pluripotency. Here we describe a cis-regulatory module conserved throughout the eutherian mammals that is capable of recapitulating endogenous Nanog expression; at the core of this module is a composite oct-sox element necessary for pluripotent expression. MATERIALS AND METHODS

Sequence Analysis and Promoter Constructs—All sequences were from public data bases at www.ncbi.nlm.nih.gov and/or www.ensembl. org. A genomic sequence corresponding to Homo sapiens NANOG (15) was used. Bos taurus (cow) and Loxodonta africana (elephant) sequences were constructed from trace files available at NCBI. Mus musculus (mouse) and Rattus norvegicus (rat) sequences were from the respective compiled genomic sequences. The promoter region of mouse Nanog was amplified from genomic DNA. Primers for amplification, with restriction sites for cloning purposes indicated in lowercase, were: forward, 5!-CGCgtcgacTAAAGTGAAATGAGGTAAAGCC-3!, and reverse, 5!-CGCggatccGGAAAGATCATAGAAAGAAGAG-3!. The amplified product was cloned into pGL3-Basic (Promega) and pEGFP1 (Clontech) vectors and sequence-verified.

24731

24732

Transcriptional Regulation of Nanog by Oct4 and Sox2

Embryonic Stem Cell Culture and Reporter Lines—E14 mouse ESCs were grown in Dulbecco’s modified Eagle’s medium, 20% fetal bovine serum, 1" non-essential amino acids, 0.1 mM 2-mercaptoethanol, and an aliquot of recombinant LIF conditioned medium. ESCs were stably transfected with the NanogEGFP construct using a standard protocol, and individual colonies were picked after selection with 300 !g/ml G418 for 10 days with cells grown on neomycin-resistant mouse embryonic fibroblasts. Differentiation of ESCs was by withdrawal of LIF-conditioned medium, spontaneous differentiation into embryoid bodies, or the addition of retinoic acid. Fluorescence-activated cell sorting (FACS) was on a FACSCalibur (BD Biosciences). Quantitation of endogenous Nanog and enhanced green fluorescent protein (EGFP) reporter expression was by reverse transcription-PCR analyses in real time using the ABI PRISM 7900 sequence detection system (Applied Biosystems). Proligo-synthesized primer-probe sets for these were as follows: Nanog forward, 5!-GGTTGAAGACTAGCAATGGTCTGA-3!; Nanog reverse, 5!-TGCAATGGATGCTGGGATACTC-3!; Nanog probe, 5!-TTCAGAAGGGCTCAGCACCA-3!; EGFP forward, 5!-CGACAACCACTACCTGAGCAC-3!; EGFP reverse, 5!-TCGTCCATGCCGAGAGTGAT-3!; EGFP probe, 5!-CGGCGGCGGTCACGAACTCCAGC-3!. Expression was normalized to a "-actin control (Applied Biosystems). The human ESC line HUES-6 (obtained from Doug Melton, Harvard University) was passaged according to Cowan et al. (16). Luciferase Reporter Assays—F9 cells (ATCC) were cultured in Dulbecco’s modified Eagle’s medium with high glucose (Invitrogen), 15% standard fetal bovine serum (Hyclone), and 1% penicillin/streptomycin and maintained at 37 °C with 5% CO2. DNA transfection was by Lipofectamine 2000 (Invitrogen) following the supplied protocol. A Renilla luciferase plasmid (pRL-TK from Promega) was co-transfected as an internal control. Cells were harvested after 24 h, and the luciferase activity of the lysate was measured with the Dual-Luciferase reporter assay system (Promega) using the Centro LB960 96-well luminometer (Berthold Technologies). At a minimum, transfections were done in duplicate and on two independent occasions. Reporter plasmids were modified using the Transformer site-directed mutagenesis kit (Clontech) to incorporate 3-bp mutations which were subsequently verified by sequencing. Electrophoretic Mobility Shift Assays (EMSA)—Nuclear extracts were prepared from E14 mouse ESCs grown on mouse embryonic fibroblast feeders using the method of Dignam et al. (17) with modifications as described. Cells were washed and harvested by scraping in phosphate-buffered saline, resuspended in 5 pellet volumes of buffer A (10 mM Hepes, pH 7.9, 1.5 mM MgCl2, 10 mM KCl, 0.5 mM dithiothreitol, 1% protease inhibitor mixture (Sigma)), and incubated on ice for 10 min. Cells were resuspended in 2 pellet volumes of buffer A and lysed with a Dounce homogenizer. Pelleted nuclei were resuspended in 0.6 pellet volumes of buffer C (20 mM Hepes, pH 7.9, 25% glycerol, 420 mM NaCl, 1.5 mM MgCl2, 0.2 mM EDTA, 0.5 mM dithiothreitol, 1% protease inhibitor mixture) and incubated at 4 °C with rotation for 30 min. After centrifugation supernatants were dialyzed against dialysis buffer (20 mM Hepes, pH 7.9, 20% glycerol, 100 mM KCl, 0.83 mM EDTA, 1.66 mM dithiothreitol, 1% protease inhibitor mixture) at 4 °C for 2 h. Extracts were then stored at #80 °C. For EMSA, double-stranded DNA oligonucleotides (Proligo) labeled with cy5 at the 5! termini of both strands were used. Sense strand sequences were as follows: FOS, 5!-TTTAAGTATCCCATTAGCATCCAAACAAAGAGTTTTC-3!; NOS, 5!-CTTACAGCTTCTTTTGCATTACAATGTCCATGGTGGA-3!, NmOS, 5!-CTTACAGCTTCTTTCAAATTACAATGTCCATGGTGGA-3!; NOmS, 5!-CTTACAGCTTCTTTTGCATTAACCTGTCCATGGTGGA-3!; NmOmS, 5!-CTTACAGCTTCTTTCAAATTAACCTGTCCATGGTGGA-3!; NNS, 5!-CTGCAGGTGGGATTAACTGTGAATTCA-3!. For DNA binding reactions, 2 !l ($16 !g) of nuclear extract was added to a 10-!l reaction (final) containing 50 nM cy5 oligonucleotide and 5 !g of poly(dGdC) (Amersham Biosciences). The final binding buffer composition was 60% dialysis buffer. Where specified, 1 !M unlabeled double-stranded competitor was also included prior to the addition of nuclear extracts. Binding reactions were incubated for 20 min at room temperature. Where specified, antibodies (Santa Cruz Biotechnology) were added after the initial incubation for a further 20 min as follows: 2 !l of anti-Oct4 (sc-9081x) and anti-JunB (sc-46x) or 8 !l of anti-Sox2 (sc-17320) and anti-Sox4 (sc-17326). Binding reactions were resolved on pre-run 6% native PAGE gels (18.5 " 20 cm) in 0.5" Tris-borate-EDTA for 2 h at 300 V. Gels were imaged directly in glass plates using an Amersham Biosciences Typhoon 9140 PhosphorImager. EMSA performed with mouse embryonic fibroblast feeders alone produced no significant mobility shifts, indicating that they did not contribute to the observed protein-DNA complexes. Chromatin Immunoprecipitation (ChIP) Assay—ChIP assays with

E14 mouse ESCs were carried out as described (18). Briefly, cells were cross-linked with 1% formaldehyde for 10 min at room temperature, and formaldehyde was inactivated by the addition of 125 mM glycine. Chromatin extracts containing DNA fragments with an average size of 500 bp were immunoprecipitated using Oct4 (N19) or Sox2 (Y17) polyclonal antibodies (Santa Cruz Biotechnology) or a Sox2 (AB5603) polyclonal antibody (Chemicon). For all ChIP experiments, quantitative PCR analyses were performed in real time using the ABI PRISM 7900 sequence detection system (Applied Biosystems) and SYBR Green master mix (Applied Biosystems) as described (19). Relative occupancy values were calculated by determining the apparent immunoprecipitate efficiency (ratios of the amount of immunoprecipitated DNA over that of the input sample) and normalized to the level observed at a control region, which was defined as 1.0. Primer pairs were as follows: 1, 5!-GGCAAACTTTGAACTTGGGATGTGGAAATA-3!, 5!-CTCAGCCGTCTAAGCAATGGAAGAAGAAAT-3!; 2, 5!-GAGGATGCCCCCTAAGCTTTCCCTCCC-3!, 5!-CCTCCTACCCTACCCACCCCCTATTCTCCC-3!; 3, 5!-GGGTCACCTTACAGCTTCTTTTGCATTA-3!, 5!-GGCTCAAGGCGATAGATTTAAAGGGTAG-3!; 4, 5!-GGTGATACGTTGGCCTTCTAGTCTGAA-3!, 5!-GGGCAAATTGCAAACTAACTGTATAACCTC-3!. RNAi Analysis—E14 mouse ESCs were seeded into 96-well plates at 20,000 –30,000 cells/well density (with primary embryonic fibroblasts at 1000 cells/well) 1 day prior to transfection. Transfection was performed with Lipofectamine 2000. 50 ng of Nanog-pGL3 (firefly luciferase construct), 150 ng of pSuper or pSuper_RNAi (for further details, see Ref. 12), and phRLSV40 (normalization control expressing Renilla luciferase) were transfected. Twenty-four hours after transfection, cells were selected for puromycin resistance for 2 further days prior to the luciferase activity being measured. The sequences used for RNAi were: Sox2, 5!-GAAGGAGCACCCGGATTA-3!; Oct4, 5!-GAAGGATGTGGTTCGAGT-3!. RESULTS

Sequence Comparisons Identify Conserved Non-coding Sequences—As Nanog is expressed in both mouse and human pluripotent cells, we reasoned the pluripotent transcription of this gene is maintained through the functional conservation of cis-regulatory elements, and concomitantly, the location and sequence of these elements would be conserved through purifying selection. Therefore we extracted mouse and human Nanog genomic sequences from the public databases. A pairwise alignment of the mouse and human gene sequences was generated utilizing the Vista on-line tool at LBL Berkley (20). We aligned the genomic region from 10 kb upstream of the most 5! mouse Nanog EST to 10 kb downstream of the most 3! EST. Peaks of sequence similarity corresponded to the four exons of Nanog, most significantly in the homeodomain-encoding exons 2 and 3 (data not shown). Of the 26.1 kb of non-coding sequence that we compared, the only area of significant sequence conservation was a 200-bp region immediately upstream of the 5!-most EST. This focused our attention on this region for further study as the sequence conservation suggested functionally conserved cis-regulatory elements. Proximal Promoter Is Sufficient to Recapitulate Endogenous Expression—To functionally test this conserved proximal promoter region, we generated stably transfected mouse ESC lines with a plasmid vector containing the mouse sequence from #289 to %117 (relative to the transcription start site) driving the expression of an EGFP. We had defined the Nanog transcription start site as the position of the furthest 5! public EST; this has since been defined as the predominant transcription start site (4). After G418 selection for stably transfected ES lines, the majority of colonies fluoresced green, indicating that this region of the Nanog promoter is indeed active in pluripotent cells (Fig. 1B). A number of these NanogEGFP colonies were selected for further analysis. In addition to the pluripotent-positive cis-regulatory elements within this promoter fragment, we were interested in determining whether this promoter contained sufficient cisregulatory information to down-regulate transcription upon ESC differentiation as occurs with endogenous Nanog. There-

Transcriptional Regulation of Nanog by Oct4 and Sox2

24733

FIG. 1. The Nanog proximal promoter drives EGFP expression in undifferentiated ESCs but not in their differentiated derivatives. A mouse ESC line (NanogEGFP) stably transfected with a construct containing the Nanog promoter (#289 to %117 relative to the transcription start site) driving EGFP expression was compared under different culture conditions. Light (A, C, and E) and fluorescence (B, D, and F) microscopy are shown of undifferentiated ESCs (A and B), ESCs cultured without LIF for 3 days (C and D), and ESCs treated with 0.1 !M RA for 3 days (E and F). G, real-time PCR detection of transcript levels of endogenous Nanog as compared with that of the Nanog promoter-driven EGFP under undifferentiated and differentiating conditions. Samples were amplified in triplicate and expressed relative to a "-actin control. Standard deviations are indicated. Numbers within the graph indicate percentage of EGFP-positive cells as measured by FACS analysis. EB, embryoid body.

fore we compared EGFP expression in the undifferentiated NanogEGFP ESCs to their differentiated derivatives (Fig. 1). Retinoic acid (RA), a strong inducer of ESC differentiation, drastically reduced EGFP expression after 3 days of treatment (Fig. 1, E and F). Culturing NanogEGFP cells in the absence of LIF, a less aggressive differentiation protocol than RA, reduced EGFP expression noticeably after 3 days of culture (Fig. 1, C and D). This expression pattern was seen in three independent NanogEGFP lines. To directly compare transcripts from this NanogEGFP to that of the endogenous Nanog itself, we used real-time PCR (Fig. 1G). The EGFP expression closely paralleled that of the endogenous Nanog as indicated in comparisons of embryoid bodies of 2, 4, 6, and 8 days as well as 2 and 4 days after induction with RA or exclusion of LIF from the medium. In addition, the percentage of EGFP-positive cells, as determined by FACS analysis, in this NanogEGFP reporter line under differing differentiation conditions correlated well with endogenous Nanog expression (Fig. 1G). For instance, the undifferentiated ESCs were 74% EGFP-positive, whereas the 8-day embryoid bodies and 4-day RA-treated cells were 3 and 9% EGFP-positive, respectively. These results indicate that this 406-bp fragment of Nanog contains sufficient cis-regulatory information to recapitulate endogenous Nanog expression, at least as tested in an in vitro ESC-based system. Phylogenetic Footprinting Identifies an Oct-Sox Composite Element—We next aimed to identify the specific cis-regulatory elements responsible for driving pluripotent expression. Generating an alignment of the proximal promoter region from a diverse range of mammals enabled us to identify a strong phylogenetic footprint. A very conservative estimate of the divergence time between the five species used in constructing this alignment (12 million years ago for the mouse-rat split and 60 million years ago between each of the mouse-rat, human, cow, and elephant) provides a cumulative 252 million years of purifying selection to identify a footprint. The greatest stretch of sequence conservation was over a 94-bp region located from position #212 to #119 relative to the Nanog transcription start site; within this, 64 positions were invariant (Fig. 2A). A 16-bp stretch represented the longest uninterrupted invariant sequence and, intriguingly, within this was a 15-bp oct-sox composite element (Fig. 2A) similar to those identified

in and known to be functional for pluripotent expression of Fgf4, Utf1, Sox2, Pou5f1, and Fbx15 (8 –12). The position of the sox and oct elements relative to each other is the same in all five of these genes, which is indicative of the specific proteinprotein contacts between the corresponding transcription factors (Sox2 and Oct4) known to bind this composite element (21, 22). Similar to Utf1, Sox2, Pou5f1, and Fbx15, the sox and oct cis-elements in Nanog are immediately adjacent to one another, whereas in Fgf4, 3 bases separate the two elements. Such strong purifying selection at numerous positions within the Nanog promoter suggests a functional role for these positions in the regulation of transcription. Some of these positions, such as the oct-sox composite element, are likely to be important in pluripotent expression, whereas others may represent cis-regulatory elements functional in the post-implantation expression of Nanog. Functional Analysis of the Conserved cis-Regulatory Module—To study the role of specific sequence elements in the activation of pluripotent cell transcription, we constructed a series of luciferase reporter constructs and transfected these into F9 teratocarcinoma cells, a mouse cell line representative of the pluripotent phenotype. The wild-type 406-bp promoter (#289 to %117) from the NanogEGFP construct had significant transcriptional activity in this system; its activity was set arbitrarily to 100% for all further comparisons (Fig. 2B, wt). A shorter region (#230 to %63) still inclusive of the conserved sequence in Fig. 2A had $130% activity of the original promoter. This suggests a potential negative regulatory element contained within the #289 to #231 region that is excluded from this shorter promoter but also indicates the presence of pluripotent activator elements within the #230 to %63 span. To test the orientation dependence of the conserved module, we made identical constructs but with a region (spanning #289 to #94) containing this module in either forward or reverse orientation relative to the immediate proximal promoter (#93 to %117). Restriction enzyme sites introduced when creating these two constructs slightly altered the sequence from that of the wild-type control. 79% of wild-type activity was maintained in the forward construct, and 48% was maintained in the reverse construct. These luciferase data are summarized from a total of four different experiments with three wells of cells/ construct/experiment. Although significant transcriptional ac-

24734

Transcriptional Regulation of Nanog by Oct4 and Sox2

FIG. 2. Phylogenetic footprint and promoter-luciferase reporter assays. A, an alignment of Nanog sequences from mouse, rat, human, cow, and elephant. Positions, in the region #212 to #119 relative to the transcription start site, that remain invariant over &250 million years of cumulative evolution are shaded. The oct-sox composite element is indicated, as are the 3-bp replacement mutations with their corresponding names given below. B, luciferase reporter assays of various Nanog promoter constructs transiently transfected into F9 teratocarcinoma cells. The luciferase activity of the #289 to %117 mouse Nanog promoter fragment (wt) is arbitrarily set at 100% and is compared with the activity of altered mouse promoter constructs. The key is as follows: sh.wt, short wild-type promoter (#230 to %63); For and Rev, identical to one another but for the conserved region (#93 to %117) in the reverse orientation relative to the transcription start site; M1, M2, Oct, Sox, M3, and M4, 3-bp substitution mutations described in A; O/S, mutation altering both the Oct and the Sox sites.

tivity was maintained in the reverse orientation, it did consistently show a lower transcriptional activity than the forward (wild-type) orientation, which suggests that the function of this conserved module has subtle orientation effects. The Oct and Sox cis-Elements Are Required for Nanog Promoter Activity—To test their effect on pluripotent transcriptional activity, 3-bp mutations were generated within this core conserved region (Fig. 2A). The M2, M3, and M4 mutations, cumulatively covering seven invariable positions of the 9 bp they span, did not greatly alter transcriptional activity (Fig. 2B). The M3 mutation did not decrease promoter activity, and the M2 and M4 mutations dropped luciferase activity down to $65% of wild-type. These subtle effects on promoter activity may indeed be significant in the context of the in vivo pluripotent cell. It is also possible that these positions covered by the M2, M3, and M4 mutations have some other functional role (as they are conserved) other than activation of transcription in pluripotent cells, possibly in the repression of expression in a Nanog-negative cell type or activation of transcription in Nanog-positive cells that occur later in development. As anticipated, mutations affecting the oct and sox elements dropped promoter activity to 17 and 15%, respectively (Fig. 2B); additionally, when both of these sites were mutated in the same construct, activity was reduced to 6% of wild-type. A conserved sequence 20 bp 5! to the oct-sox composite element also had a very drastic effect on transcription (Fig. 2B, M1), lowering it to 22% of wild type. This suggests that the oct and sox cis-regulatory elements are not the only elements important in pluripotent transcription of Nanog and highlights the utility of phylogenetic footprinting for identifying functionally important cis-regulatory elements. However, our subsequent analysis focused on the sox and oct elements themselves. Sox2 and Oct4 Bind the Composite cis-Element—To determine whether Oct4 and Sox2 were able to recognize the composite element within the Nanog promoter, we performed EMSA using a 37-bp probe (NOS) spanning this sequence combined with nuclear extracts from mouse ESCs. The Fgf4 enhancer element (FOS) known to bind Oct4 and Sox2 in EMSA (see Ref. 8) was used as a positive control. Four major proteinDNA complexes (A–D) were observed using the NOS probe

(Fig. 3). Similar complexes appeared on the FOS probe with one additional complex (E). The addition of antibodies to either Oct4 or Sox2 resulted in supershifts of complexes A and C or C and E, respectively, indicating that complex A represents binding of the Oct4 monomer; complex E represents the Sox2 monomer, and complex C represents the Oct4/Sox2 heterodimer. Antibody specificity was confirmed as antibodies against JunB or Sox4 had no effect. Significantly, Sox2 in E14 extracts was not observed binding as a monomer to NOS. Following transient transfection of Sox2 into NIH 3T3 cells, we observed that Sox2 did bind NOS as a monomer in EMSA (not shown), indicating that Sox2 is capable of binding the Nanog element in the absence of Oct4. This suggests that all available Sox2 in the E14 extracts binds as a heterodimer with the more abundant Oct4 and that heterodimer formation is favored on the Nanog element relative to the Fgf4 element. This is corroborated by the observation that the ratio of the intensity of the heterodimer complex (C) to that of the Oct4 monomer complex (A) was consistently greater on NOS as compared with FOS (Fig. 3). We also observed a consistently faster mobility of the Oct4/ Sox2 heterodimer binding NOS as compared with that binding FOS. This is compatible with the idea that a more closed conformation could occur on the NOS sequence due to the lack of nucleotides separating the oct and sox elements, as occurs in the FOS sequence. It is tempting to speculate that such distinct conformations could result in a differential recruitment of coregulatory factors and a differential transcriptional activity. Competitions with unlabeled double-stranded oligonucleotides established the DNA binding specificity of Oct4 and Sox2 on the Nanog element. First, binding by all factors to NOS or FOS was strongly competed by the addition of a 20-fold excess of the identical unlabeled oligonucleotide, whereas a nonspecific unlabeled oligonucleotide containing a downstream sequence from the Nanog promoter (NNS) did not compete (Fig. 3). Further indication that similar complexes were forming on the two elements was evidenced from the ability of the unlabeled FOS to compete with binding to labeled NOS and vice versa. To further characterize the binding specificity of Oct4 and Sox2, we used competitors in which the Nanog sequence

Transcriptional Regulation of Nanog by Oct4 and Sox2

FIG. 3. Oct4 and Sox2 form heterodimers on the Nanog octamer/HMG composite element. EMSA was performed using either the labeled putative Oct4/Sox2 element from the Nanog promoter or the known Oct4/Sox2 binding site from the Fgf4 enhancer as a positive control. We tested binding of factors in crude nuclear extracts of E14 embryonic stem cells. Protein-DNA complexes containing Oct4 and Sox2 were identified by the addition of antibodies (Ab) that supershifted (ss) the indicated bands. Binding specificity was tested using oligonucleotide (oligo) competitors (comp.), which indicated that binding of Oct4 required an intact octamer motif and binding of Sox2 required an intact HMG element. Heterodimerization required both to be intact. The key to competitors is as follows: N, Nanog; F, Fgf4; O, Oct4 binding site; S, Sox2 binding site; mO, mutated Oct4 binding site; mS, mutated Sox2 binding site; NNS, nonspecific competitor.

contained 3-bp substitutions (as in Fig. 2A) in the oct (NmOS), sox (NOmS), or both (NmOmS) elements. Unlabeled NmOS was unable to compete effectively for binding of the Oct4 monomer or the Sox2-Oct4 heterodimer on either NOS or FOS, whereas Sox2 binding was competed on the FOS element. Likewise, the Sox2 monomer and the heterodimer were able to bind both NOS and FOS in the presence of NOmS, whereas binding of the Oct4 monomer was competed. Moreover the competitor with substitutions in both elements, NmOmS, did not effectively compete for binding of Oct4, Sox2, or the heterodimer to either NOS or FOS. Taken together, these results establish that binding of Oct4 requires an intact octamer motif, binding of Sox2 requires an intact sox element, and binding of the heterodimer requires that both be intact. It is noteworthy that both NmOS and NOmS partially competed for heterodimer formation as would be predicted since they bind the Sox2 and Oct4 monomers, respectively. Significantly, following disruption of the heterodimer on NOS by competition of Oct4 binding by NOmS, the Sox2 monomer was able to bind NOS. This further establishes that Sox2 can bind the Nanog element as a monomer and that in the absence of competitor, all of the available Sox2 in E14 extracts binds this element as a heterodimer with Oct4, suggesting that heterodimer formation is favored on this element. Oct4 and Sox2 Bind to the Nanog Promoter in Vivo and Regulate Its Activity—To confirm that Oct4 and Sox2 do interact with the Nanog promoter in vivo, we performed ChIP experiments with Oct4 and Sox2 antibodies and nuclear extracts from mouse and human ESCs and in mouse ESCs differentiated with retinoic acid. From undifferentiated mouse ESCs grown in feeder-free conditions, DNA fragments containing the composite oct-sox element were enriched up to 43- and 37-fold when immunoprecipitated with the Oct4 and Sox2 antibodies, respectively (Fig. 4A, Amplicon 2). Two neighboring regions that did not contain the oct-sox composite element were not significantly enriched (Fig. 4A, Amplicons 1 and 3). Furthermore, upon retinoic acid-induced differentiation of mouse ESCs, this enrichment of the oct-sox-containing fragments was

24735

FIG. 4. In vivo interaction of Oct4/OCT4 and Sox2/SOX2 with the Nanog/NANOG promoter. A, ChIP was performed on undifferentiated mouse ESCs (0D RA) and ESCs cultured in the presence of RA for 3 and 6 days. Immunoprecipitation with antibodies against Oct4 (N19) and Sox2 (Y17) are shown. Fold enrichment measured by real time PCR, of three different amplicons, were compared; one of these (Amplicon 2) encompasses the composite oct-sox element (open box in schematic), and the two others (Amplicons 1 and 3) do not. The position, relative to the transcription start site, of each amplicon is indicated in the schematic diagram, and exon 1 of Nanog is indicated by a solid black box. B, OCT4 and SOX2 ChIP of NANOG in human ESCs. The fold enrichment of specific amplicons from undifferentiated cells is indicated. The N19 antibody against OCT4 and the AB5603 antibody against SOX2 were used. All standard deviations are indicated.

reduced corresponding to the degree of differentiation. After 3 days of differentiation, enrichment only reached a maximum of 10-fold above background with both Oct4 and Sox2 antibodies, and after 6 days of differentiation, no significant enrichment was detectable (Fig. 4A). Using an antibody to Mll as a negative control, there was no significant enrichment for any of the amplicons from any of the three ESC states analyzed (data not shown). The level of enrichment identified here corresponds well with the activity of the Nanog promoter in a similar differentiation protocol described above with the EGFP reporter line. To establish that OCT4 and SOX2 also interact with the NANOG promoter in human ESCs, we performed a similar ChIP analysis on the human ESC line HUES-6, these grown on inactivated mouse embryonic fibroblasts. We used six different amplicons (Fig. 4B), all within close proximity to exon 1 of NANOG, to detect for enrichment of DNA chromatin-immunoprecipitated with OCT4 and SOX2 antibodies. Only the three closest amplicons to the composite sox-oct element showed significant enrichment with both antibodies (Fig. 4B). A second Sox2 antibody (Y17) gave similar results (data not shown). The lower fold enrichment for all three antibodies ($6-fold) as compared with that seen in mouse ESCs may be due to the presence of more differentiated cells within the human culture as undifferentiated human ESCs tend to be more difficult to maintain in culture. Using a glutathione S-transferase antibody as a negative control, there was no significant enrichment for any of the amplicons in these human ESCs (data not shown). In summary, all these ChIP data clearly indicate in vivo occu-

24736

Transcriptional Regulation of Nanog by Oct4 and Sox2

FIG. 5. Knockdown of Oct4 or Sox2 reduces Nanog promoter activity. RNAi constructs targeting Sox2, Oct4, and EGFP (as a control) were co-transfected with a Nanog promoter-luciferase reporter construct (A) into mouse ESCs and assayed for luciferase activity 3 days after transfection (B). A construct without inhibitory RNA was used as a negative control (#). Luciferase activity was measured relative to the Renilla luciferase internal control. Standard deviations are indicated. ORF, open reading frame.

pancy of Oct4/OCT4 and Sox2/SOX2 on the Nanog/NANOG promoter in undifferentiated mouse and human ESCs. To establish the functional importance of Oct4 and Sox2 on Nanog promoter activity, we performed RNAi experiments to reduce the level of Oct4 or Sox2. Both Pou5f1 and Sox2 RNAi constructs used in this study were previously shown to specifically knock down their respective mRNAs (12). We co-transfected a Nanog-luciferase reporter construct with Pou5f1, Sox2, GFP, or empty RNAi constructs (Fig. 5A) into mouse ESCs and subsequently assayed for luciferase activity. Luciferase activity was reduced to almost that of background levels with the Pou5f1 and Sox2 RNAi, whereas the GFP and empty RNAi controls had no effect (Fig. 5B). These results establish a genetic link between the pluripotent activity of the Nanog proximal promoter and the levels of Sox2 and Oct4. DISCUSSION

Here we have characterized the proximal promoter of Nanog and established that the region from #289 to %117 relative to the transcription start site contains sufficient cis-regulatory information to recapitulate endogenous Nanog expression, at least as seen in undifferentiated ESCs and their differentiated derivatives. Within this promoter, we identified a composite sox-oct cis-regulatory module necessary for pluripotent expression and showed that the pluripotent transcription factors Sox2 and Oct4 bind this module both in vitro and in living mouse and human ESCs. These data are the first to directly link these three transcription factors within the pluripotent cell genetic regulatory network. Furthermore, as Nanog is known to be essential for the pluripotent phenotype (1, 2), our data further emphasize the position of the Sox2-Oct4 complex at the top of the pluripotent regulatory network hierarchy. Concurrent to our investigation reported here, Kuroda et al. (23) also describe Sox2 and Oct4 binding, respectively, to the sox and octamer elements within the Nanog promoter. Their data largely corroborate our findings but with some minor yet significant differences. Kuroda et al. (23) observed Sox2/Oct4 heterodimers binding to the Nanog sox-oct element by EMSA in nuclear extracts harvested from F9 cells and from an embryonic germ (EG) cell line but not in the R1 mouse ES cell line. Instead, in the R1 cells, they observed an unidentified Oct/Sox element binding factor, which they named PSBP (pluripotent cell-specific Sox element-binding protein), that they suggest binds cooperatively with Oct4 via an interaction with the sox element. This binding activity was present in R1 ES cell nuclear extracts but not observed in the F9 or EG cells. In contrast, we did not observe any significant differences between binding activities in F9 or our E14 mouse ES nuclear extracts (data not shown). The reason for this difference is unclear but may represent a difference between the R1 and E14 ES cells used or may be due to the different conditions used in the EMSA experiment. Signif-

icantly, in our EMSA experiments, we used 5 !g of the nonspecific DNA competitor poly(dGdC) as compared with 2 !g used by Kuroda et al. (23), suggesting that binding by PSBP may require less stringent binding conditions. Furthermore, they suggest that a competition between Sox2 and PSBP for heterodimerization with Oct4 on sox-oct elements might occur and that PSBP is favored in ES cells, putting into question the requirement of Sox2 for Nanog transcription in ES cells. Our results clearly establish that Sox2 is required for the ES cell-specific expression as when it is knocked down by RNAi, Nanog transcription is severely compromised (Fig. 5). One final difference is that they were unable to detect binding of the Sox2 monomer to the Nanog sox-oct element when either endogenously or exogenously expressed. They attribute this to a low sequence-specific DNA binding affinity. In contrast, whereas we did not detect the Sox2 monomer in E14 nuclear extracts binding to the Nanog sox-oct element, we did observe Sox2 binding both when it was expressed exogenously in NIH 3T3 cells (data not shown) and when it was expressed endogenously in E14 nuclear extracts when the Oct4/Sox2 heterodimer was disrupted by competition with the unlabeled NOmS (Fig. 3). This indicates that Sox2 does have an adequate affinity for the Nanog sox-oct element to be detected by EMSA and that in E14 extracts, the Sox2 monomer is not observed binding because the heterodimer is highly favored. This is supported by comparing the EMSA using the Nanog element with the element in Fgf4 (Fig. 3). The 3-bp spacing between the sox and octamer elements of Fgf4 leads to a decrease in cooperative binding by the heterodimer so that the Sox2 monomer can be readily detected. Kuroda et al. (23), on the other hand, do not observe any differences on Sox2 binding between the sox-oct elements of Nanog or Fgf4. Nanog is one of six genes to be identified as a target of Sox2-Oct4 synergism in pluripotent cells (Table I). The corresponding sox-oct composite element fits the classical description of an enhancer element as within these six target genes, it is orientation- and location-independent. We show that within the Nanog promoter, this element is transcriptionally active in either orientation (Fig. 2). Furthermore, there is no apparent association between location and orientation of the sox-oct composite element and the expression level of the corresponding gene (Table I) as measured by massively parallel signature sequencing in mouse ESCs (24). Considering the apparently allowable sequence variation within the sox-oct composite element between the six target genes (Table I), it is then surprising to see the invariant nature of this element in the Nanog promoter over an accumulated 250 million years of evolution. This conservation strongly suggests a functional requirement for this exact sequence; this may be through maintaining a precise dissociation constant of the Sox2-Oct4 complex on this element that is functionally important and/or an allosteric effect that this ligand may have on the binding of potential co-activators to the bound complex. There is yet no evidence for an allosteric effect, but certainly binding affinities of Sox2 and Oct4 on this composite element have been shown to differ between the target genes. For instance, here we characterized differences between elements from Nanog and Fgf4, with the heterodimer binding more efficiently to Nanog. This may reflect sequence differences within the sox and oct sites between these genes but also likely results from the additional 3 bp separating these sites in Fgf4. Additionally, unlike all other targets, the element from Fbx15 is unable to bind Oct4 as a monomer (10). Certainly subtle variations on the level of Nanog are known to lead to different cell fates (25) and as such, the level of transcription of its gene must be tightly regulated.

24737

Transcriptional Regulation of Nanog by Oct4 and Sox2 TABLE I Molecular features of known Sox2-Oct4 target genes Gene

a b c

sox element

Spacing

oct element

Reference

Orientation

Fgf4

CTTTGTT

3

ATGCTAAT

8

Forward

Utf1

CATTGTT

0

ATGCTAGT

9

Reverse

Sox2

CATTGTG

0

ATGCATAT

11

Forward

Fbx15

CATTGTT

0

ATGATAAA

10

Reverse

Nanog

CATTGTA

0

ATGCAAAA

This paper

Reverse

Pou5f1

CTTTGTT

0

ATGCATCT

12, 13

Reverse

Locationa

MPSS levelb

bp

tpmc

3022 3! 1838 3! 4041 3! 523 5! 181 5! 1991 5!

101 2009 105 129 112 388

Relative to the transcription start site. MPSS, massively parallel signature sequencing (24). tpm, tags per million in mouse ESCs.

The identification of the sox-oct element in Nanog from the African elephant suggests that the Sox2-Oct4 complex drives Nanog transcription in all eutherian mammals as the elephant represents the most distal clade within this group of mammals. During our data base searches, we identified the chick homolog of Nanog (ensemble ID, ENSGALG00000014319; Unigene ID, Gga.16924); interestingly, this gene did not contain any identifiable sox-oct composite element, suggesting that chick Nanog transcription is not regulated by the Sox2-Oct4 complex. Furthermore, the repeating tryptophan residues in the C-terminal end of mammalian Nanog, which are known to be involved in a potent transcriptional activation function in pluripotent cells (26), are not present in chick Nanog. Apparently, Nanog has evolved uniquely mammalian features (and possibly restricted to the eutherian mammals) both with respect to transcriptional regulation and with respect to protein function. This raises interesting questions regarding the evolution of the mammalian pluripotent cell transcriptional network. The fact that Nanog transcripts are apparently present in the initial formation of the mouse blastocyst in conceptuses lacking Oct4 (2) appears counterintuitive to our results but may be an indication that the molecular mechanisms activating the initial transcription of Nanog during development are different from those regulating maintenance of its expression in the pluripotent cell. The committed pluripotent cells in which Nanog has an essential function in maintaining, cells such as the ESC and those found in the epiblast of the expanded blastocyst, are developmentally distinct from their totipotent precursors. Although Nanog transcripts first appear in the inner cells of the morula, cells in the preimplantation mouse conceptus do not become committed to either the trophoblast or the epiblast lineage until after blastocyst formation (27, 28). This epiblast commitment, something that does not occur in Oct4 null blastocysts (5), may be associated with the onset of Oct4 regulation of Nanog. Here we have directly implicated Oct4 and Sox2 in the regulation of Nanog; this remains a rather simplistic view of the transcriptional regulation of Nanog. Both Oct4 and Sox2 are present in the nucleus of Nanog-negative cells of the morula and their precursors (6, 14), indicating that other molecular signals, besides the appropriate cellular location of Oct4 and Sox2, are required for pluripotent-specific expression of Nanog. Indeed, this may explain why we did not see transactivation of a Nanog promoter-luciferase reporter construct in co-transfection experiments with Oct4 and Sox2 expression constructs in 293 and 3T3 cells (data not shown). We are currently addressing whether these additional molecular signals are mediated through other cis-elements within this conserved promoter

(such as the M1 site characterized above) and/or through coactivators interacting with the Oct4/Sox2 complex. Further characterization of this cis-regulatory module will continue to enhance our understanding of the pluripotent stem cell genetic regulatory network. Acknowledgment—We thank Liu Shaw Cheng for technical assistance. REFERENCES 1. Mitsui, K., Tokuzawa, Y., Itoh, H., Segawa, K., Murakami, M., Takahashi, K., Maruyama, M., Maeda, M., and Yamanaka, S. (2003) Cell 113, 631– 642 2. Chambers, I., Colby, D., Robertson, M., Nichols, J., Lee, S., Tweedie, S., and Smith, A. (2003) Cell 113, 643– 655 3. Wang, S. H., Tsai, M. S., Chiang, M. F., and Li, H. (2003) Gene Expr. Patterns 3, 99 –103 4. Hart, A. H., Hartley, L., Ibrahim, M., and Robb, L. (2004) Dev. Dyn. 230, 187–198 5. Nichols, J., Zevnik, B., Anastassiadis, K., Niwa, H., Klewe-Nebenius, D., Chambers, I., Scho¨ler, H., and Smith, A. (1998) Cell 95, 379 –391 6. Avilion, A. A., Nicolis, S. K., Pevny, L. H., Perez, L., Vivian, N., and LovellBadge, R. (2003) Genes Dev. 17, 126 –140 7. Niwa, H., Miyazaki, J., and Smith, A. G. (2000) Nat. Genet. 24, 372–376 8. Yuan, H., Corbi, N., Basilico, C., and Dailey, L. (1995) Genes Dev. 9, 2635–2645 9. Nishimoto, M., Fukushima, A., Okuda, A., and Muramatsu, M. (1999) Mol. Cell. Biol. 19, 5453–5465 10. Tokuzawa, Y., Kaiho, E., Maruyama, M., Takahashi, K., Mitsui, K., Maeda, M., Niwa, H., and Yamanaka, S. (2003) Mol. Cell. Biol. 23, 2699 –2708 11. Tomioka, M., Nishimoto, M., Miyagi, S., Katayanagi, T., Fukui, N., Niwa, H., Muramatsu, M., and Okuda, A. (2002) Nucleic Acids Res. 30, 3202–3213 12. Chew, J.-L., Loh, Y.-H., Zhang, W., Chen, X., Tam, W.-L., Yeap, L.-S., Li, P., Ang, Y.-S., Lim, B., Robson, P., and Ng, H.-H. (2005) Mol. Cell. Biol., in press 13. Okumura-Nakanishi, S., Saito, M., Niwa, H., and Ishikawa, F. (2005) J. Biol. Chem. 280, 5307–5317 14. Palmieri, S. L., Peter, W., Hess, H., and Scho¨ler, H. R. (1994) Dev. Biol. 166, 259 –267 15. Booth, H. A., and Holland, P. W. (2004) Genomics 84, 229 –238 16. Cowan, C. A., Klimanskaya, I., McMahon, J., Atienza, J., Witmyer, J., Zucker, J. P., Wang, S., Morton, C. C., McMahon, A. P., Powers, D., and Melton, D. A. (2004) N. Engl. J. Med. 350, 1353–1356 17. Dignam, J. D., Lebovitz, R. M., and Roeder, R. G. (1983) Nucleic Acids Res. 11, 1475–1489 18. Wells, J., and Farnham, P. J. (2002) Methods (Orlando) 26, 48 –56 19. Ng, H. H., Robert, F., Young, R. A., and Struhl, K. (2003) Mol. Cell 11, 709 –719 20. Mayor, C., Brudno, M., Schwartz, J. R., Poliakov, A., Rubin, E. M., Frazer, K. A., Pachter, L. S., and Dubchak, I. (2000) Bioinformatics 16, 1046 –1047 21. Reme´nyi, A., Lins, K., Nissen, L. J., Reinbold, R., Scho¨ler, H. R., and Wilmanns, M. (2003) Genes Dev. 17, 2048 –2059 22. Williams, D. C., Jr., Cai, M., and Clore, G. M. (2004) J. Biol. Chem. 279, 1449 –1457 23. Kuroda, T., Tada, M., Kubota, H., Kimura, H., Hatano, S. Y., Suemori, H., Nakatsuji, N., and Tada, T. (2005) Mol. Cell. Biol. 25, 2475–2485 24. Wei, C. L., Miura, T., Robson, P., Lim, S. K., Xu, X. Q., Lee, M. Y., Gupta, S., Stanton, L., Luo, Y., Schmitt, J., Thies, S., Wang, W., Khrebtukova, I., Zhou, D., Liu, E. T., Ruan, Y. J., Rao, M., and Lim, B. (2005) Stem Cells (Durham) 23, 166 –185 25. Hatano, S. Y., Tada, M., Kimura, H., Yamaguchi, S., Kono, T., Nakano, T., Suemori, H., Nakatsuji, N., and Tada, T. (2005) Mech. Dev. 122, 67–79 26. Pan, G., and Pei, D. (2005) J. Biol. Chem. 280, 1401–1407 27. Rossant, J., and Vijh, K. M. (1980) Dev. Biol. 76, 475– 482 28. Chisholm, J. C., Johnson, M. H., Warren, P. D., Fleming, T. P., and Pickering, S. J. (1985) J. Embryol. Exp. Morphol. 86, 311–336