Proc. Natl. Acad. Sci. USA Vol. 90, pp. 9978-9982, November 1993 Biochemistry

Cell-specific action and mutable structure of a transcription factor effector domain (transcriptional regulation/POU-domain genes/unstable DNA)

EDWIN S. MONUKI*†, RAINER KUHN*‡, AND GREG LEMKE*§

*Molecular Neurobiology Laboratory, The Salk Institute, 10010 North Torrey Pines Road, La Jolla, CA 92037; and †Department of Neurosciences, University of California at San Diego, La Jolla, CA 92093

Communicated by Stephen F. Heinemann, June 28, 1993

protein concentration that it activates transcription of octamer-containing reporter constructs (9). In this report, we identify the SCIP effector domain. We find that this domain (i) is located at the amino terminus of the protein, (ii) mediates both transactivation and repression, (iii) is an essential determinant of the cell-specific action of the protein, and (iv) contains an alanine homopolymer that apparently arose from the expansion of unstable DNA and that is of no demonstrable functional significance. We discuss the implications of these findings for the structure, function, and evolution of transcription factor effector domains.

ABSTRACT POU proteins are cell-specific transcription factors whose specificity of action has been attributed to protein-DNA and protein-protein interactions mediated by their DNA-binding (POU) domains. Here we report that transcriptional activation by SCIP, a POU protein expressed by developing Schwann cells, is dependent on an amino-terminal effector domain and that this domain mediates cell-specific transactivation in the complete absence of the POU domain. When fused to a heterologous DNA-binding domain, this SCIP domain is a potent transactivator in Schwann cells but is inactive in three heterologous cell types. The primary structure of the SCIP amino-terminal domain is novel but contains a polymorphic string of alanine residues similar to those found in several other transcription factors. Although such strings were previously hypothesized to be important for transcription factor activity, we find that the SCIP string is functionally irrelevant. We propose that homopolymers of alanine, and certain other amino acids, do not represent a motif required for transcription factor function but instead reflect regions of unstable DNA related to those associated with four recently characterized human genetic disorders.

MATERIALS AND METHODS

Construction and Analysis of SCIP Mutants. Amino-terminal, carboxyl-terminal, and internal deletions of the rat SCIP cDNA were generated both by restriction enzyme digests and by polymerase chain reaction (PCR). The alanine deletion (Δala) constructs described in Fig. 4 were generated by PCR and resulted in the incorporation of a single cloning-derived proline residue in place of the 11 alanines (residues 27-37) of the SCIP string. cDNA or PCR fragments were subcloned into a polylinker-modified version of the pCG expression vector, in which transcription is driven by the cytomegalovirus (CMV) promoter (10). All expression constructs were assessed by diagnostic restriction digests and by sequencing at junction points. To verify protein expression from these constructs, 10 µg of each was transfected into adherent HeLa cells. Nuclear extracts from these cells were then analyzed for protein expression by Western blot, using a polyclonal SCIP antibody, and for DNA-binding activity, using an end-labeled octamer probe in gel shift assays, all as described previously (6). Constructs described in this report were quantitatively expressed, with no obvious effects on nuclear transport, stability, or DNA binding, except for the expected inability of the CVCFC and CVCFC(145-451) mutant proteins to bind DNA (see Fig. 1A).

For the analysis of GAL4 fusion proteins, a variety of SCIP fragments (see Fig. 2) were subcloned into a simian virus 40 (SV40) promoter-driven GAL4 vector (11), which expresses fusion proteins containing the DNA-binding domain of the yeast transcription factor GAL4 (residues 1-147). All of these constructs were verified by diagnostic restriction digests, sequenced at junction points, and tested for expression of functional protein by transfection into COS-7 monkey cells using calcium phosphate precipitation. Whole cell extracts were prepared from these cells, and gel shift analyses were performed (12) with a 32P-labeled HindIII-Xba I fragment

Transcription factors of the POU-domain family are essential regulators of cell-specific gene expression and development (1-3). These proteins are defined by the presence of the POU domain, a highly conserved DNA-binding domain composed of a class-specific homeodomain linked to an upstream "POU-specific" domain. Most POU proteins also share the ability to bind a set of A+T-rich sequences related to the 8-bp "octamer" motif first described in immunoglobulin gene enhancers (1). Mammalian members of the POU domain family include the octamer-binding proteins Oct-1 through -6, the pituitary transcription factor Pit-1/GHF-1, and the neural transcription factors Brn-1 through -4, among several others. In general, these proteins are expressed in restricted sets of cells, where they function as either transactivators or repressors of cell-specific genes (1-3). The transcription factor SCIP (4-6) [also termed Oct-6 (7) and Tst-1 (8)] is a POU protein expressed by the progenitors to myelinating glial cells in the mammalian peripheral and central nervous systems and by a subset of developing and mature neurons in the central nervous system. In Schwann cells, the principal glial cells of the peripheral nervous system, high-level expression of SCIP is confined to progenitor cells; when these cells exit the cell cycle and differentiate to form the myelin sheath, SCIP expression is dramatically reduced (5). As assayed in vitro, SCIP is capable of acting as either a repressor or an activator of transcription in these cells, depending on promoter context. The protein acts as a strong and specific repressor of the promoters of myelin-specific genes, for example, at the same time and at the same

Abbreviations: CAT, chloramphenicol acetyltransferase; CMV, cytomegalovirus; SBMA, spinal bulbar muscular atrophy; SV40, simian virus 40. ‡Present address: Department of Biotechnology, CIBA-Geigy, Basel 4002, Switzerland. §To whom reprint requests should be addressed.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


from the GAL4 CAT reporter (13), which contains five GAL4 binding sites and a bacterial chloramphenicol acetyltransferase (CAT) gene. Constructs that failed to activate the GAL4 CAT reporter upon transfection (see Figs. 2-4) were nonetheless quantitatively expressed, as assessed by their ability to recognize the GAL4 binding sites; no effects on protein stability or DNA-binding activity were apparent.

Cell Culture and Transactivation Assays. Cells were grown in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum. For some experiments, culture medium was additionally supplemented with 4-10 µM forskolin. In general, Schwann cell medium contained 4 µM forskolin, which is required for Schwann cell division (5). RNA extractions and RNase protections (Fig. 1B) were performed (6) with an antisense rabbit β-globin probe to detect both octamer reporter and reference transcripts (14). For the analysis of β-globin-based constructs (see Fig. 1B), adherent HeLa cells were cotransfected with 10 µg of each SCIP plasmid, 10 µg of octamer reporter plasmid, and 2 µg of SV40 reference plasmid by calcium phosphate precipitation. For analysis of CAT-based GAL4 fusion constructs, purified Schwann cells in DMEM with 10% fetal bovine serum and 4 µM forskolin were cotransfected with 1 µg of GAL4 fusion construct, 5 µg of GAL4 CAT reporter, and 2 µg of reference plasmid (a CMV promoter-luciferase gene construct). Equivalent amounts of protein extract from each transfection were assayed for luciferase and then CAT activity, as described (5). Relative activities were determined by scintillation counting of modified and unmodified forms of [14C]chloramphenicol and calculated as percent conversion. Reactions resulting in >30% conversion were considered nonlinear.

HeLa, NIH 3T3, and B78H1 cells cultured in DMEM with 10% fetal bovine serum were transfected as described for Schwann cells, except for the absence or presence of 10 µM forskolin in selected cultures.
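The percent-conversion readout described above can be illustrated with a minimal Python sketch. It assumes that percent conversion means the modified (acetylated) counts divided by the sum of modified and unmodified counts; the function names and the example counts are hypothetical and are not taken from the paper, and no background subtraction is modeled.

```python
# Illustrative sketch only: the counts below are made-up numbers, not data
# from the paper. Assumes percent conversion = modified / (modified + unmodified).

LINEAR_LIMIT_PERCENT = 30.0  # the text treats >30% conversion as nonlinear


def percent_conversion(modified_cpm: float, unmodified_cpm: float) -> float:
    """Percent of [14C]chloramphenicol converted to its modified forms."""
    total = modified_cpm + unmodified_cpm
    if total <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * modified_cpm / total


def in_linear_range(percent: float) -> bool:
    """True if the reaction would be accepted as linear (at most 30% conversion)."""
    return percent <= LINEAR_LIMIT_PERCENT


if __name__ == "__main__":
    pct = percent_conversion(modified_cpm=1500.0, unmodified_cpm=8500.0)
    print(f"{pct:.1f}% conversion, linear range: {in_linear_range(pct)}")
```

Under this reading, a reaction that exceeds the 30% threshold would have to be repeated with less extract or a shorter incubation before its value could be compared with the others.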

RESULTS AND DISCUSSION

Identification of the SCIP Transactivation Domain. We first analyzed the ability of various forms of SCIP to regulate transcription driven from a simple octamer-containing β-globin reporter (Fig. 1A). In cotransfection experiments, we found that SCIP was able to activate this reporter in HeLa cells, which are SCIP-negative (Fig. 1B), and in cultured Schwann cells (data not shown). As expected, mutation of the single octamer site of the reporter, which prevents SCIP binding (6), destroyed activation (Fig. 1B). To determine which portions of the SCIP protein are required for transactivation, mutated SCIP constructs (see Materials and Methods) were analyzed for their ability to activate the octamer reporter in HeLa cells (Fig. 1B). These experiments revealed the importance of two domains. The first is the SCIP POU domain, which is necessary and sufficient for DNA binding. Mutations in this domain that disrupted its DNA-binding activity [CVCFC and CVCFC(145-451) constructs] also destroyed activation. The second domain required for activation lies within a region near the SCIP amino terminus. Although removal of the first 20 amino acids did not significantly affect SCIP activity, deletion of the first 145 amino acids completely destroyed activity. Constructs that did not contain amino acids 21-145 failed to activate our reporter. In contrast, deletion of residues 146-199 was without obvious effect. Results similar to these have been recently reported for Oct-6 (mouse SCIP) transactivation of a different octamer-containing reporter (17). We next prepared constructs in which selected regions of SCIP were fused to the DNA-binding domain (residues 1-147) of the yeast transcription factor GAL4 (11) (Fig. 2A). After tests for expression and DNA-binding activity, these constructs were transfected into Schwann cells and assayed

[Figure 1 image. Panel A: SCIP expression constructs 3-451, 21-451, 145-451, 145-408, 238-402, 3-145/200-451, 21-50/145-451, CVCFC, and CVCFC(145-451), and the octamer reporter (octamer site, TATA box, β-globin reporter gene). Panel B: RNase protection with the octamer reporter and the mutant reporter; positions of the reporter transcript and the reference transcript are marked.]

FIG. 1. SCIP activation of an octamer reporter. (A) Schematic of SCIP expression constructs. Open boxes, SCIP open reading frame; hatched boxes, SCIP POU domain; black boxes, RVWFC → CVCFC mutation in the SCIP homeodomain. [A similar mutation in Pit-1 destroys DNA binding (15).] Numbers correspond to SCIP amino acid positions. (B) Transactivation in HeLa cells as detected by RNase protection. Octamer (ATTTGCAT) and mutant (AT.jTICAQ) reporters contain the octamer site 5 bp upstream of the rabbit β-globin TATA box and reporter gene (14). The SV40 reference plasmid contains the SV40 promoter upstream of the β-globin gene (14). Positions of octamer reporter and SV40 promoter-driven reference transcripts are indicated. Results from transfections with a SCIP mutant carrying a WFC → AAA mutation in its homeodomain [analogous to an Oct-1 mutation that destroys DNA binding (16)] were indistinguishable from those for the CVCFC mutant. pCGS, the expression plasmid used for all constructs, served as a "no SCIP cDNA" control.

for their ability to activate a CAT reporter containing GAL4 binding sites (13) (Fig. 2B). The GAL4-SCIP(3-451) fusion protein, which contains nearly the entire SCIP amino acid sequence, transactivated the reporter, while the GAL4-SCIP(145-451) fusion protein, which lacks the amino-terminal domain, did not. More importantly, constructs that contained only the SCIP amino-terminal region (residues 3-146) were exceptionally strong transactivators in Schwann cells (Fig. 2B). These fusions were ≈100-fold more potent than GAL4-SCIP(3-451) (Fig. 2B) or a fusion containing the entire cAMP response element-binding protein CREB (19) (data not shown) and were comparable in potency to GAL4-VP16 (Fig. 2B). This latter fusion protein contains the transactivation domain of the herpes simplex virus VP16 protein and is one of the strongest transactivators known (18). Amino- and carboxyl-terminal deletions of the SCIP(3-146) amino-terminal region destroyed its transactivation activity (Fig. 2B).

Cell Specificity of the SCIP Transactivation Domain. We then tested these same GAL4 fusion proteins for their ability to transactivate the same GAL4 CAT reporter in HeLa cells. Remarkably, GAL4-SCIP(3-146) was unable to significantly activate transcription in these cells (Fig. 3A), in marked


[Figure 2 image. Panel A: GAL4 fusion constructs GAL4, GAL4-SCIP(3-451), GAL4-SCIP(145-451), GAL4-SCIP(3-199), GAL4-SCIP(3-146), GAL4-SCIP(3-65), GAL4-SCIP(3-50), GAL4-SCIP(21-146), GAL4-SCIP(52-146), and GAL4-VP16, and the GAL4 CAT reporter (GAL4 binding sites, TATA box, CAT reporter gene). Panel B: Schwann cells; Upper graph, relative activity (log10 scale, 0-4); Lower graph, relative activity (linear scale, 0-100).]

[Figure 3 image. Panel A: HeLa cells; relative activity (log10 scale) for GAL4, GAL4-SCIP(3-451), GAL4-SCIP(3-146), and GAL4-VP16. Panel B: NIH 3T3 and B78H1 cells; relative activity (linear scale, 0-100) for GAL4, GAL4-SCIP(3-146), GAL4-SCIP(3-146) + forskolin, and GAL4-VP16.]

FIG. 2. Transactivation by GAL4-SCIP fusion proteins in Schwann cells. (A) Schematic of constructs, as described in Fig. 1 legend. Dark hatched boxes, GAL4 DNA-binding domain. The GAL4 CAT reporter (13) contains the bacterial CAT gene driven by five GAL4 binding sites upstream of the adenovirus E1B TATA box (not drawn to scale). (B) Transactivation in Schwann cells. GAL4-VP16 contains the carboxyl-terminal 78 residues of the VP16 protein of herpes simplex virus 1 (18). In this and subsequent figures, the bar graph represents CAT activity after normalization to luciferase reference activity (see Materials and Methods). Activities were calculated relative to GAL4 CAT reporter activity transfected with carrier alone, which was set to 1 (10^0) (Upper) or relative to GAL4-SCIP(3-146) activity, which was set to 100 (Lower). Error bars indicate standard error of the mean (SEM) for at least two independent transfections. Note that Upper and Lower graphs are plotted on logarithmic and linear scales, respectively.

contrast to GAL4-VP16, which, as expected from previous studies (20, 21), retained its ability to transactivate. We observed a 280-fold difference in relative activity between the GAL4-VP16 and GAL4-SCIP(3-146) fusion proteins in HeLa cells, compared with only a 4-fold difference in Schwann cells (Fig. 3A), even though our gel shift controls indicated that the GAL4-SCIP protein was expressed equally well and demonstrated comparable DNA-binding activity in both cell types. Similar results were obtained with the NIH 3T3 mouse fibroblast and B78H1 mouse melanoma cell lines (Fig. 3B). Since forskolin, which elevates intracellular cAMP and thereby indirectly activates protein kinase A, was required for our Schwann cell cultures (see Materials and Methods), we tested the effect of forskolin on transactivation by GAL4-SCIP(3-146) in NIH 3T3 and B78H1 cells. Even in


FIG. 3. Cell-specific transactivation mediated by the SCIP amino-terminal domain. (A) Cotransfections in HeLa cells. Activities were calculated relative to activity of the GAL4 DNA-binding domain alone, which was set to 1 (10^0). Error bars indicate SEM of at least two independent transfections. Note that the bar graph is plotted on a logarithmic scale. (B) Cotransfections in NIH 3T3 cells and B78H1 cells. Forskolin (10 µM) was added to selected cultures, as indicated. Activities were calculated relative to GAL4-VP16 activity, which was set to 100. Error bars indicate SEM of two independent transfections. Note that bar graphs are plotted on a linear scale.
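The scaling used in these legends (a reference condition set to 1 or to 100, with SEM across independent transfections) can be sketched in Python as follows. This is only an illustration: the paper does not state how the SEM was computed, so the sketch assumes the sample standard deviation divided by the square root of the number of transfections, and all numbers and names are hypothetical.

```python
# Illustrative sketch only: hypothetical values, not data from the paper.
# Assumes SEM = sample standard deviation / sqrt(n).
import math
from statistics import mean, stdev


def relative_activity(values, reference_mean, scale=1.0):
    """Express each measurement relative to a reference set to `scale` (1 or 100)."""
    if reference_mean <= 0:
        raise ValueError("reference mean must be positive")
    return [scale * v / reference_mean for v in values]


def sem(values):
    """Standard error of the mean for two or more independent measurements."""
    if len(values) < 2:
        raise ValueError("SEM needs at least two measurements")
    return stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    gal4_alone = [0.5, 0.5]   # hypothetical normalized CAT activities
    fusion = [10.0, 14.0]     # hypothetical values from two transfections
    rel = relative_activity(fusion, mean(gal4_alone), scale=1.0)
    print("relative activities:", rel)
    print("mean:", mean(rel), "SEM:", round(sem(rel), 3))
    print("log10 of mean (for a logarithmic plot):", round(math.log10(mean(rel)), 3))
```

Passing `scale=100.0` and a different reference mean gives the linear-scale convention, in which the reference construct is set to 100.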

the presence of forskolin, this fusion protein was unable to transactivate in these cells (Fig. 3B), indicating that the observed cell-specific differences were not due to phosphorylation of the SCIP amino-terminal domain by protein kinase A. (There are no consensus protein kinase A sites in this domain.) Thus, unlike the nonspecific activity of GAL4-VP16, transactivation by the isolated SCIP amino-terminal domain is cell-specific. This specificity differs from that attributed to the transactivation domains of POU proteins Oct-1 and Oct-2, which appear to mediate promoter-selective rather than cell-specific activation (22). This cell specificity also differs from the cell type-independent activation of the simple octamer reporter by wild-type SCIP. Interestingly, related phenomena have been reported for other POU proteins. For example, an enhancer containing six octamer sites is highly active in Oct-2-positive B cells but not in Oct-2-negative HeLa cells. Nonetheless, transfected Oct-2 is unable to activate transcription from this enhancer in HeLa cells (14). Similarly, Suzuki et al. (7) have found that in HeLa cells neither Oct-2, Oct-3/4, nor Oct-6 (mouse SCIP) can activate a reporter in which six octamer sites are located in an enhancer 150 bp upstream of a TATA box (the "6W" enhancer); this enhancer is active only in cell lines that express Oct-3/4 and SCIP (7, 23). At the same time, SCIP,

Oct-2, and Oct-3/4 all successfully activate, in HeLa cells, a reporter that contains a single octamer site located only 5 bp upstream of a TATA box (7, 23). (This is the reporter we have used.) These and related results have been interpreted to suggest that POU proteins require an additional factor or factors to function from an enhancer but do not require these factors to activate transcription from an octamer site located close to the TATA box. Several lines of evidence suggest that these additional factors are adaptor proteins that bridge POU proteins with components of the basal transcription complex (16, 24, 25). Our results support a similar model for SCIP, involving two distinct mechanisms for transcriptional activation mediated by its amino-terminal domain. The first of these may involve direct contact between SCIP and a ubiquitous component of the basal transcription complex. This mechanism would account for the cell type-independent activation of the simple octamer reporter by wild-type SCIP; transactivation of this same reporter by different POU proteins may reflect a general requirement for the POU domain. The second mechanism requires an additional cell-specific factor, or adaptor, which facilitates interaction between SCIP and proteins of the basal transcription complex. This mechanism would account for the cell specificity exhibited by GAL4-SCIP(3-146) and for the ability of the SCIP amino-terminal domain to function in the absence of the POU domain. Importantly, such a mechanism is also consistent with our recent demonstration that this same amino-terminal domain is required for SCIP repression of the major myelin-specific genes expressed by Schwann cells (9). This repression is best modeled by a "quenching" reaction in which the domain negatively interacts with a cell-specific adaptor normally required for transactivation of these genes (9). Taken together, our data on cell-specific transactivation and repression mediated by the SCIP amino-terminal domain indicate that transcription factor effector domains may often be important determinants of cell-specific gene expression.

The SCIP Alanine String. The SCIP amino-terminal domain is rich in alanine and glycine but does not carry any previously characterized transactivation motif (26). One region (residues 27-37) contains an uninterrupted string of 11 alanines. Many other transcription factors contain similar alanine strings outside their DNA-binding domains (Fig. 4A and data not shown), which has led to considerable speculation about a functional role for this motif (32, 34). The length of the SCIP alanine string, however, is very poorly conserved between rat and mouse SCIP (5, 7, 33) (Fig. 4A), which are otherwise identical except for two conservative amino acid substitutions. In addition, pronounced variability in string length and amino acid composition extends to the amino-terminal regions of three other POU proteins (Brn-1, Brn-2, and Brn-4) with which SCIP otherwise shares extensive homology (31). Given these observations, we decided to delete the alanine string from the GAL4-SCIP(3-146) activator and compare the activities of the deleted and intact fusion proteins. The protein in which the alanine string was deleted retained full activity as a transactivator (Fig. 4B). In fact, this deletion was the only alteration to the SCIP amino-terminal domain we tested that did not affect transactivation. Deletion neither diminished nor enhanced transactivation, suggesting that the alanine string is irrelevant for SCIP function. This conclusion is further supported by our finding that deletion of the string similarly has no effect on the ability of full-length SCIP to repress major myelin (P0) gene transcription (Fig. 4C).

Amino Acid Strings, Transcription Factor Effector Domains, and Unstable DNA. All but one of the alanine residues in the SCIP string are encoded by GCG or GCC codons (Fig. 4B). Lack of third-position degeneracy and poor evolutionary conservation are typical of the alanine strings found in many


[Figure 4 image. Panel A (Upper), alanine strings as extracted from the figure:
En (62-98): PAMAFDAAAADAAAAAAAAAHAHAAALQQRLSGSGSP
Eve (163-182): PYAPAAAAAAAAAAAVATNP
E74a (452-480): PATSASAAAAAAAAAAAAAILHTGTFI.,HP
Rap1 (8-32): PTANAASANAAAAAAAAASNSTAIP
Brn-1 (99-118): P HAAAAAAAAAAAAVEASSP
Hox4.3 (8-34): PI.YSKYKAAAAAAAAAAAAAAGEAINP
SCIP (21-58): PLMHPDAAAAAAAAAAAERLHAGAAYREVQKLMIHHEWL

Panel A (Lower), alignment near the alanine string:
Amino acids (rat):  D  A  A  A  A  A  A  A  A  A  A  A  E
R SCIP:             GATGCCGCCGCGGCGGCGGCAGCGGCGGCGGCGGCCGAG
M SCIP (Oct-6) (7):  GATGCC---GCGGCGGCGGCACCGGCG------GCCGAG
M SCIP (Oct-6) (33): GATGCCGCCGCGGCGGCGGCAGCGGCG------GCCGAG

Panel B: relative activity (linear scale, 0-100) for GAL4, GAL4-SCIP(3-146), GAL4-SCIP(Δala1), and GAL4-SCIP(Δala2). Panel C: fold repression (0-10) for pCGS (100), SCIP(3-451) (30), SCIP(3-451) (100), SCIPΔala (30), SCIPΔala (100), and SCIPΔalaFS (100).]

FIG. 4. Functional irrelevance of the SCIP alanine string. (A) (Upper) Selected examples of alanine strings from SCIP and other transcription factors, shown in single-letter amino acid code. Numbers in parentheses correspond to amino acid positions of engrailed (En) (27), even-skipped (Eve) (28), E74a (29), Rap1 (30), Brn-1 (31), Hox4.3 (32), and SCIP. (Lower) Alignment of rat (R) SCIP (5) and mouse (M) SCIP (Oct-6) (7, 33) amino acid and nucleotide sequences near the alanine string. Rat SCIP contains 11 alanines in this string, while the two mouse SCIP sequences contain 8 and 9 alanines. (B) Assay of transactivation activity. Cotransfections were performed in Schwann cells. Schematics of GAL4 expression constructs are as described in Fig. 2. Δala1 and Δala2 are two identical but independently isolated GAL4 fusion protein subclones in which the alanine string of SCIP is replaced with a single proline residue. Activities were calculated relative to GAL4-SCIP(3-146) activity, which was set to 100. Error bars indicate SEM of two independent transfections. Note that the bar graph is plotted on a linear scale. (C) Assay of repression activity. Cotransfections were performed in Schwann cells. Repression activity was measured against a construct in which CAT expression was driven by the 1.1-kb promoter of the rat major myelin P0 gene, exactly as described (9). Fold repression was calculated relative to pCGS (set to 1.0), also as described (9). Schematics of full-length SCIP expression constructs are as described in Fig. 1. Plasmid concentrations used in cotransfections were 30 ng or 100 ng of SCIP expression plasmid (as indicated), 5 µg of P0-CAT reporter plasmid (9), and 2 µg of luciferase reference plasmid (see Materials and Methods). Luciferase-equivalent amounts of transfected cell extracts were assayed for CAT activity, as described (9). SCIPΔala is identical to wild-type SCIP, except that the alanine string is replaced by a single proline residue. SCIPΔalaFS is a negative control that also carries this string deletion but, in addition, contains a single-base frameshift deletion at amino acid 146. Error bars represent SEM of fold repression exhibited by two identical but independently isolated SCIPΔala subclones.
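Two points made in the text about the alanine string can be checked with a short Python sketch: the codon usage of the rat string, and the way a single inserted G turns a GCG (alanine) run into a GGC (glycine) run. The rat sequence below is read from the Fig. 4A alignment, with its first base assumed to be G by analogy with the mouse sequences; the codon table is deliberately partial, the eight-codon GCG run is a made-up example, and all function names are hypothetical.

```python
# Illustrative sketch only. RAT_SCIP is read from the Fig. 4A alignment; its
# first base (G) is an assumption based on the aligned mouse sequences.
from collections import Counter

# Partial codon table: only the codons needed for this example.
CODON_TABLE = {
    "GAT": "D", "GAG": "E",
    "GCA": "A", "GCC": "A", "GCG": "A", "GCT": "A",
    "GGA": "G", "GGC": "G", "GGG": "G", "GGT": "G",
}

RAT_SCIP = "GAT" + "GCC" * 2 + "GCG" * 3 + "GCA" + "GCG" * 4 + "GCC" + "GAG"


def codons(seq):
    """Split a sequence into complete codons, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def translate(seq):
    """Translate with the partial table; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(c, "X") for c in codons(seq))


def alanine_codon_usage(seq):
    """Count which codons encode the alanines in the sequence."""
    return Counter(c for c in codons(seq) if CODON_TABLE.get(c) == "A")


def insert_base(seq, position, base):
    """Insert one base at a 0-based position, shifting the downstream frame."""
    return seq[:position] + base + seq[position:]


if __name__ == "__main__":
    print(translate(RAT_SCIP))
    print(alanine_codon_usage(RAT_SCIP))
    gcg_run = "GCG" * 8                         # hypothetical repeat
    shifted = insert_base(gcg_run, 12, "G")     # extra G after the fourth codon
    print(translate(gcg_run), "->", translate(shifted))
```

The tally gives one GCA codon among the 11 alanine codons, in line with the statement that all but one alanine is encoded by GCG or GCC, and the frameshift example shows why an alanine string and a glycine string can occupy the same position in closely related genes.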

other transcription factors. For example, the 14-residue alanine string of Hox-4.3 (Fig. 4A) is principally encoded by a single codon (GCG), and this string is absent from the otherwise highly conserved paralogs Hox-3.1 and Hox-2.4 (32). Similarly, the 14-residue alanine string of the engrailed protein from Drosophila melanogaster (Fig. 4A) is shortened


to three residues in the same protein from Drosophila virilis (35). Interestingly, the DNA that encodes these evolutionarily unstable strings is similar to the unstable DNA that underlies four human genetic disorders. These diseases, fragile X syndrome (36), chromosome X-linked spinal bulbar muscular atrophy (SBMA) (37), myotonic dystrophy (38), and Huntington disease (39), all involve the abnormal expansion of G+C-rich trinucleotide repeats. These repeats are unstable, exhibiting a high degree of length polymorphism in both the disease and general populations. When located within protein coding sequence, they encode amino acid strings that exhibit significant length polymorphism. In SBMA, the unstable DNA lies within sequence that encodes a transcription factor, the androgen receptor (37). The trinucleotide runs that expand and contract in these human diseases do not encode alanine strings. The CAG repeats of SBMA, for example, encode polyglutamine. It is important to note, however, that in addition to alanine, transcription factors are, as a rule, studded with highly polymorphic homopolymeric strings of glutamine, glycine, histidine, and proline. All of these amino acids are encoded by G+C-rich codons, which are thought to be prone to expansion and contraction through slipped mispairing during DNA replication (40). This instability occasionally leads to evolutionary changes in both string length and amino acid composition within families of closely related transcription factors. In the mouse class III POU domain family, for example, the alanine string of SCIP (encoded primarily by ... GCG-GCG-GCG ... ) appears at the same position in Brn-1 as a longer glycine string (encoded by ... GGC-GGC-GGC ... ) due to a frameshift mutation that introduces an extra G after the fourth codon of the string (31). Thus, the particular amino acid present in any string may be of minor

importance. We therefore propose that most or all transcription factor strings initially arose from the replication of unstable DNA. While it seems reasonable to speculate that such DNA might serve as a mechanism for protein evolution (41), our observations that (i) many transcription factor strings are not conserved through evolution and (ii) excision of the SCIP alanine string is without effect on transactivation or repression suggest that poly(amino acid) strings may often be irrelevant to transcription factor function. Recent work in several laboratories supports the hypothesis that expansion and contraction of unstable DNA during replication may be a widespread phenomenon that is either routinely corrected or selected against (42, 43). Although amino acid homopolymers generated by triplet expansion are not unique to transcription factor effector domains, these domains are unusual in their tolerance of such homopolymers. (Transcription factor DNA-binding domains, for example, do not carry amino acid strings.) This tolerance in turn suggests that strings may be gratuitous loops that link functional subdomains of contiguous amino acids and that such subdomains may assemble into complete effector domains without significant regard to the length of these intervening loops.

We thank Dan Ortunlo and Chris McGaugh for excellent technical assistance and Drs. Patrick Matthias, Winship Herr, Kathy Jones, and Marc Montminy for expression and reporter plasmids. This work was supported by grants from the National Multiple Sclerosis Society, the Rita Allen Foundation, the Pew Scholars Program, and the National Institutes of Health (to G.L.). E.S.M. is supported by the Medical Scientist Training Program at the University of California, San Diego, and R.K. was supported by a postdoctoral fellowship from the Deutsche Forschungsgemeinschaft.

1. Scholer, H. R. (1991) Trends Genet. 7, 323-329.
2. Ruvkun, G. & Finney, M. (1991) Cell 64, 475-478.
3. Rosenfeld, M. G. (1991) Genes Dev. 5, 897-907.
4. Monuki, E. S., Weinmaster, G., Kuhn, R. & Lemke, G. (1989) Neuron 3, 783-793.
5. Monuki, E. S., Kuhn, R., Weinmaster, G., Trapp, B. D. & Lemke, G. (1990) Science 249, 1300-1303.
6. Kuhn, R., Monuki, E. S. & Lemke, G. (1991) Mol. Cell. Biol. 11, 4642-4650.
7. Suzuki, N., Rohdewohld, H., Neuman, T., Gruss, P. & Scholer, H. R. (1990) EMBO J. 9, 3723-3732.
8. He, X., Treacy, M. N., Simmons, D. M., Ingraham, H. A., Swanson, L. W. & Rosenfeld, M. G. (1989) Nature (London) 340, 35-41.
9. Monuki, E. S., Kuhn, R. & Lemke, G. (1993) Mech. Dev. 42, 15-32.
10. Tanaka, M. & Herr, W. (1990) Cell 60, 375-386.
11. Sadowski, I. & Ptashne, M. (1989) Nucleic Acids Res. 17, 7539.
12. Schmitz, M. L. & Baeuerle, P. A. (1991) EMBO J. 10, 3805-3817.
13. Lillie, J. W. & Green, M. R. (1989) Nature (London) 338, 39-44.
14. Muller-Immergluck, M. M., Schaffner, W. & Matthias, P. (1990) EMBO J. 9, 1625-1634.
15. Li, S., Crenshaw, E. B., Rawson, E. J., Simmons, D. M., Swanson, L. W. & Rosenfeld, M. G. (1990) Nature (London) 347, 528-533.
16. Stern, S., Tanaka, M. & Herr, W. (1989) Nature (London) 341, 624-630.
17. Meijer, D., Graus, A. & Grosveld, G. (1992) Nucleic Acids Res. 20, 2241-2247.
18. Sadowski, I., Ma, J., Triezenberg, S. & Ptashne, M. (1988) Nature (London) 335, 563-564.
19. Gonzales, G. A., Yamamoto, K., Fischer, W. H., Karr, D., Menzel, P., Biggs, W., III, Vale, W. W. & Montminy, M. R. (1989) Nature (London) 337, 749-752.
20. Stringer, K. F., Ingles, C. J. & Greenblatt, J. (1990) Nature (London) 345, 783-786.
21. Lin, Y.-S. & Green, M. R. (1991) Cell 64, 971-981.
22. Tanaka, M., Lai, J.-S. & Herr, W. (1992) Cell 68, 755-767.
23. Scholer, H. R., Dressler, G. R., Balling, R., Rohdewohld, H. & Gruss, P. (1990) EMBO J. 9, 2185-2195.
24. Kristie, T. M., LeBowitz, J. H. & Sharp, P. A. (1989) EMBO J. 8, 4229-4238.
25. Scholer, H. R., Ciesiolka, T. & Gruss, P. (1991) Cell 66, 291-304.
26. Mitchell, P. J. & Tjian, R. (1989) Science 245, 371-378.
27. Poole, S. J., Kauvar, L. M., Drees, B. & Kornberg, T. (1985) Cell 40, 37-43.
28. Frasch, M., Hoey, T., Rushlow, C., Doyle, H. & Levine, M. (1987) EMBO J. 6, 749-759.
29. Burtis, K. C., Thummel, C. S., Jones, C. W., Karim, F. D. & Hogness, D. S. (1990) Cell 61, 85-99.
30. Shore, D. & Nasmyth, K. (1987) Cell 51, 721-732.
31. Hara, Y., Rovescalli, A. C., Kim, Y. & Nirenberg, M. (1992) Proc. Natl. Acad. Sci. USA 89, 3280-3284.
32. Izpisua-Belmonte, J.-C., Dolle, P., Renucci, A., Zappavigna, V., Falkenstein, H. & Duboule, D. (1990) Development (Cambridge, UK) 110, 733-745.
33. Meijer, D., Graus, A., Kraay, R., Langeveld, A., Mulder, M. P. & Grosveld, G. (1990) Nucleic Acids Res. 18, 7357-7365.
34. Licht, J. D., Grossel, M. J., Figge, J. & Hansen, U. M. (1990) Nature (London) 346, 76-79.
35. Kassis, J. A., Poole, S. J., Wright, D. K. & O'Farrell, P. H. (1986) EMBO J. 5, 3583-3589.
36. Verkerk, A. J. M. H., Pieretti, M., Sutcliffe, J. S., Fu, Y. H., Kuhl, D. P., Pizzuti, A., Reiner, O., Richards, S., Victoria, M. F. & Zhang, F. P. (1991) Cell 65, 905-914.
37. La Spada, A. R., Wilson, E. M., Lubahn, D. B., Harding, A. E. & Fischbeck, K. H. (1991) Nature (London) 352, 77-79.
38. Brook, J. D., McCurrach, M. E., Harley, H. G., Buckler, A. J., Church, D., Aburatani, H., Hunter, K., Stanton, V. P., Thirion, J. P. & Hudson, T. (1992) Cell 68, 799-808.
39. Huntington's Disease Collaborative Research Group (1993) Cell 72, 971-983.
40. Jeffreys, A. J., Royle, N. J., Wilson, V. & Wong, Z. (1988) Nature (London) 332, 278-281.
41. Richards, R. I. & Sutherland, G. R. (1992) Cell 70, 709-712.
42. Aaltonen, A., Peltomaki, P., Leach, F. S., Sistonen, P., Pylkkanen, L., Mecklin, J. P., Jarvinen, H., Powell, S. M., Jen, J., Hamilton, S. R., Kinzler, K. W., Vogelstein, B. & de la Chapelle, A. (1993) Science 260, 812-815.
43. Thibodeau, S. N., Bren, G. & Schaid, D. (1993) Science 260, 816-818.
