The EMBO Journal vol.8 no. 1 pp.219-227, 1989
The homeotic gene Sex Combs Reduced of Drosophila: gene structure and embryonic expression
Peter K.LeMotte, Atsushi Kuroiwa1, Liselotte I.Fessler2 and Walter J.Gehring
Biozentrum, University of Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland, 1Metropolitan Institute for Neurosciences, 2-6 Musashidai, Fuchu-City, Tokyo 183, Japan and 2Molecular Biology Institute, University of California, Los Angeles, CA 90024, USA
Communicated by W.J.Gehring
The homeotic gene Sex Combs Reduced (Scr) of Drosophila is required during embryogenesis for labial and first thoracic segment development. We define the Scr gene structure, showing that the major embryonic transcript is proximal to the fushi tarazu gene, and report the sequence of the transcript, which encodes a 413-amino acid, homeodomain-containing protein. We describe Scr protein distribution throughout embryogenesis. Expression begins at gastrulation and is eventually apparent in three tissues, epidermis, nervous system and visceral mesoderm, though there are clear contrasts in the domains of expression in these three tissues.
Key words: homeotic gene/Sex Combs Reduced/Drosophila/development
©IRL Press
Introduction
Initial development of the Drosophila embryo is dependent upon positional information laid down by maternal genes. Many of the genes required for creating the initial positional information as well as for later stages of embryonic pattern formation have been identified in large mutant screens for visible defects in embryonic development (Nüsslein-Volhard and Wieschaus, 1980; Jürgens et al., 1984; Nüsslein-Volhard et al., 1984; Wieschaus et al., 1984). The zygotic mutants fall into several classes, defining different stages of embryonic development. At the first stage, gap genes divide the embryo into three large domains on the anterior-posterior axis. Based on this activity the pair-rule genes create seven stripes of expression along the embryo, and subsequently the segment polarity genes are expressed in 14 stripes to define the future 14 segments comprising most of the embryo. Although this process results in an embryo which is subdivided into segments, another class of genes is required to further differentiate the individual segments. This class, the homeotic genes, begins to be transcribed at blastoderm in specific segments of the embryo. The activity of these genes is required for segment identity since mutations in these genes cause the homeotic transformation of one segment into another segment [for a review of the different gene classes see Akam (1987)].
Biochemically, the functions of the products of genes involved in segmentation and in the determination of segment identity are not well defined. Some suggestion comes from amino acid sequence similarity with other proteins which are better characterized. For instance, the products of the gap genes Krüppel and hunchback show sequence similarity to the Xenopus transcription factor TFIIIA, a zinc finger protein (Tautz et al., 1987); a segment polarity gene, wingless, is similar in sequence to the oncogene int-1 (Rijsewijk et al., 1987), which may function as a growth factor in the regulation of cell growth; and the homeobox-containing genes (which include maternal genes, segmentation genes and homeotic selector genes) show sequence similarity to the yeast proteins a1/α2 (Shepherd et al., 1984), P1 (Gehring, 1987) and PHO2 (Bürglin, 1988) which function as DNA-binding proteins. Structural analyses by nuclear magnetic resonance spectroscopy of the homeodomain from a Drosophila protein indicate that the homeodomain contains a helix-turn-helix motif as is found in prokaryotic gene regulatory proteins, and functional studies show that it serves a sequence-specific DNA-binding function (M.Müller, M.Affolter, W.Leupin, K.Wüthrich and W.J.Gehring, submitted for publication and G.Otting, Y.Qian, M.Müller, W.J.Gehring and K.Wüthrich, submitted for publication).
The work we report here is concerned with the homeotic gene Sex Combs Reduced (Scr). Mutants in Scr were first isolated and characterized by Kaufman and co-workers (Kaufman et al., 1980). Scr is located in a cluster of developmentally important genes known as the Antennapedia complex (ANT-C). Scr mutations have been localized genetically and on the physical DNA map both proximal and distal to the segmentation gene fushi tarazu (ftz) (Kaufman et al., 1980; Denell et al., 1981; Wakimoto and Kaufman, 1981; Hazelrigg and Kaufman, 1983; Scott et al., 1983; Kuroiwa et al., 1985). We have previously reported the location of transcripts within the ANT-C proximal to ftz which are likely to correspond to the Scr gene (Kuroiwa et al., 1985). We had isolated cDNA clones which contained sequences from two exons located 15 kb apart. The second exon also contained a homeobox. In situ hybridization to embryo sections showed that this transcript was localized at the appropriate times to regions where the Scr gene is required in development, suggesting this transcript corresponded to the Scr gene. Analysis of overlapping deficiencies was consistent with the assignment of this transcript to the Scr gene.
To be able to analyse further at the molecular level the role of Scr in development, and the function of its different regulatory elements, we have determined more precisely the physical structure of the gene, in particular with respect to ftz. We report here the structure of the 5' end of the gene, including an additional 5' exon, and the complete sequence of the message. To determine the locations and times where Scr may be functioning in embryonic development we have raised antibodies to the Scr protein and examined its expression pattern in the different germ layers. Our results on Scr protein distribution corroborate and extend results reported by others (Mahaffey and Kaufman, 1987; Riley et al., 1987; Carroll et al., 1988).
[Figure 1 map. Scale: 240 to 150 kb. Breakpoint labels: T(2;3)ScrXT145, T(2;3)ScrXF5, In(3R) Msc1. Transcript labels: Dfd, Antp, X, ftz, Scr.]
Fig. 1. Map of a portion of the ANT-C (described in Kuroiwa et al., 1985). Numbers are in kilobases. Below the line are exons of transcripts of Scr, ftz, X and the 3' ends of Antp and Dfd transcripts. Above the line are the approximate locations of the breakpoints of three Scr mutants.
Scr shows distinct expression patterns in the epidermis, the nervous system and the mesoderm. These patterns of expression suggest that a common regulatory hierarchy involving Scr and other homeotic genes may be functioning in these three tissues to direct very different pathways of cellular differentiation.
Results
Scr gene structure and sequence
Chromosomal breakpoints of Scr mutants have been identified both distal and proximal to ftz (Scott et al., 1983; Kuroiwa et al., 1985). We have identified and include in Figure 1 the approximate location of two additional Scr mutations, ScrXT145 and ScrXF5 (isolated by G.Jürgens, unpublished data). The former maps to the 3' region and the latter to the 5' region of the Scr transcript. We have now isolated cDNA clones from a random-primed cDNA library, which have allowed identification of the first exon in this transcript. Figure 1 illustrates the location of the three exons in relation to a chromosomal walk in the ANT-C (Garber et al., 1983; Kuroiwa et al., 1985). S1 protection of a genomic DNA fragment showed the position of exon 1 to be at 200-201 kb in the chromosomal walk (Figure 2). It is separated from exon 2 by a 6-kb intron. Using as a primer a DNA fragment from the 5' exon, primer extension on total embryonic RNA (0-22 h) produced a single extension product (Figure 2), which also corresponded to the position of S1 protection of the 5' end of exon 1. The DNA sequence at the transcription initiation site, AGCAGTC, corresponds closely to the consensus sequence for Drosophila transcription initiation, ATCA(G/T)T(C/T) (Hultmark et al., 1986), supporting the identification of this start site for Scr transcription during embryogenesis. Scr transcription appears to begin either at the first or second A of this sequence, though to a greater extent at the second A. We have sequenced the cDNA of the entire Scr message, which has a total length of 4135 bp (to the 3' end of the cDNA). Analysis of open reading frames reveals a single long open reading frame which initiates near the beginning of exon 2 and includes the homeobox. The DNA sequence of the message and conceptual translation of the long open reading frame are shown in Figure 3.
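The comparison between the observed initiation site and the consensus can be illustrated with a short sketch. It is only an illustration, not part of the original analysis: it assumes the consensus is read position by position as A, T, C, A, G or T, T, C or T, and the names `CONSENSUS` and `match_positions` are hypothetical.

```python
# Illustrative sketch: compare the Scr initiation site with the Drosophila
# initiation consensus. The position-by-position reading of the consensus
# (A, T, C, A, G/T, T, C/T) is an assumption made for this example.
CONSENSUS = ["A", "T", "C", "A", "GT", "T", "CT"]  # allowed bases per position


def match_positions(site, consensus=CONSENSUS):
    """Return one boolean per position: True where the base fits the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [base in allowed for base, allowed in zip(site.upper(), consensus)]


site = "AGCAGTC"  # sequence at the Scr transcription initiation site
flags = match_positions(site)
n_match = sum(flags)
# 1-based positions (and bases) that deviate from the consensus
mismatches = [(i + 1, site[i]) for i, ok in enumerate(flags) if not ok]

print(f"{n_match} of {len(site)} positions match; mismatches: {mismatches}")
```

Under this reading, six of the seven positions agree with the consensus, and the only deviation is the G at the second position.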
Three bases just 5' to the homeobox are corrected from our previously published partial Scr sequence (Kuroiwa et al., 1985). The open reading frame would encode a protein of 413 amino acids, showing a calculated net positive charge of +35. As we reported previously (Kuroiwa et al., 1985), the homeodomain of the protein shows a high degree of sequence similarity to the Antp class of homeoproteins. It is most