The EMBO Journal vol.7 no.7 pp.2151 - 2161, 1988

A novel, tissue-specific, Drosophila homeobox gene

Mark Barad, Thomas Jack1, Robin Chadwick1 and William McGinnis1

Departments of Human Genetics and 1Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06511, USA

Communicated by M.Noll

The homeobox gene family of Drosophila appears to control a variety of position-specific patterning decisions during embryonic and imaginal development. Most of these patterning decisions determine groups of cells on the anterior-posterior axis of the Drosophila germ band. We have isolated a novel homeobox gene from Drosophila, designated H2.0. H2.0 has the most diverged homeobox so far characterized in metazoa, and, in contrast to all previously isolated homeobox genes, H2.0 exhibits a tissue-specific pattern of expression. The cells that accumulate transcripts for this novel gene correspond to the visceral musculature and its anlagen.

Key words: homeobox/visceral mesoderm/H2.0/Drosophila/expression

©IRL Press Limited, Oxford, England

Introduction

Many homeotic selector and segmentation genes of Drosophila appear to belong to a single, highly diverged gene family, whose most conspicuous region of similarity is a 180 bp sequence called the homeobox (McGinnis et al., 1984a,b; Scott and Weiner, 1984). The homeoboxes of homeotic selector genes encode amino acid domains of ~60 residues (homeodomains) that are 60-90% identical in pair-wise comparisons (Regulski et al., 1985; Kuroiwa et al., 1985). Other members of the family encode homeodomains with 40-80% identity, and perform a variety of pattern formation functions in Drosophila, determining the identity of groups of cells on the axes of the embryo. Thus, the presence of a homeodomain correlates with a pattern formation function in Drosophila. However, computer-aided comparisons have revealed that distantly related homologs of the homeodomain (~25% identity) are encoded by the yeast mating type genes, a1 and α2, which determine cell identity in this patternless, single-celled organism (Shepherd et al., 1984; Laughon and Scott, 1984).

In Drosophila, 20 homeobox-containing transcription units have been described (McGinnis et al., 1984b; Scott and Weiner, 1984; Fjose et al., 1985; Kuroiwa et al., 1985; Levine et al., 1985; Mlodzik et al., 1985; Poole et al., 1985; Regulski et al., 1985; Frigerio et al., 1986; Bopp et al., 1986; Doyle et al., 1986; Hoey et al., 1986; MacDonald and Struhl, 1986). Fourteen of these loci have known genetic functions that control various position-specific determinative decisions; the genetic functions of the remaining six are either unknown or as yet unreported. Seven of the fourteen with known genetic functions act as homeotic selectors, specifying the identity of individual segments (i.e. Sex combs reduced (Scr): Lewis et al., 1980; Antp: Le Calvez, 1948; Ultrabithorax (Ubx): Lewis, 1978; Abdominal A and Abdominal B: Sanchez-Herrero et al., 1985; Karch et al., 1985; caudal: MacDonald and Struhl, 1986; Deformed: Regulski et al., 1987; Merrill et al., 1987). Five act in the specification of the number and polarity of the body segments (i.e. fushi tarazu, even skipped (eve), paired, gooseberry: Nüsslein-Volhard and Wieschaus, 1980; engrailed: Morata and Lawrence, 1975; Kornberg, 1981). Of the remaining two genes, one (bicoid) assigns anterior co-ordinate values in the early embryo (Frohnhofer and Nüsslein-Volhard, 1986). The other (zerknüllt), whose normal expression is required for proper development of the amnioserosa, may assign either extreme posterior or extreme dorsal identity in the embryo (Wakimoto et al., 1984).

The position-defining activity of these Drosophila homeobox genes must affect many different cell types and tissues in the regions of their expression. The yeast mating type loci appear to have a much simpler genetic function, acting individually and in combination to determine the different yeast cell types (MacKay and Manney, 1974a,b). In this paper, we report the isolation and characterization of a new homeobox gene which has not only the most diverged homeodomain so far characterized in multicellular organisms but also an expression pattern specific to all precursors of a single tissue. This gene may, therefore, be performing a genetic role, that of specifying a single tissue, which is both functionally and evolutionarily intermediate between the roles of previously known Drosophila homeobox genes and those of the yeast homeobox homologs.

Results

Isolation of a new homeobox gene

The homeobox sequence from the Scr locus hybridizes to numerous restriction fragments in genomic Southern blot experiments, more than any other known homeobox sequence (Figure 1). All but one of these hybridizing fragments have previously been isolated as clones in genomic 'walks' or in screens for homeobox-containing genes. To isolate this remaining Drosophila homeobox, which resides on a 2.0 kb EcoRI fragment, we screened 150 000 plaques of a λgt10 library of EcoRI-digested OregonR genomic DNA at low stringency with the 400 bp MluI fragment which contains the Scr homeobox. As expected, the only novel clone isolated in this screen contained a region of homeobox homology on a 2.0 kb EcoRI fragment (Figure 1). We used an EcoRI-PstI fragment of this novel clone, containing the 3' end of the homeobox, to screen 500 000 plaques of a 3-12 h embryonic cDNA library, kindly provided by Larry Kauver, at high stringency. This screen yielded 11 positive clones.

M.Barad et al.

[Figure 1: genomic Southern blot autoradiographs. Left panel: Scr homeobox probe, low stringency. Right panel: H2.0 probe, high stringency. Legible band labels include Scr and Antp.]

Fig. 1. Cloning of H2.0. Genomic OreR DNA was prepared as described in Materials and methods, digested with EcoRI, electrophoresed and blotted onto nitrocellulose. The left panel shows an autoradiographic exposure of the blot after hybridization with a 400 bp fragment containing the homeobox sequence from the Scr locus (Kuroiwa et al., 1985). Many of the fragments hybridized have been cloned and identified and are designated by genetic locus and size in kilobases (Regulski et al., 1985; Levine et al., 1985; Poole et al., 1985; MacDonald et al., 1986; Frasch et al., 1987). The same fragment was used to probe 500 000 plaques of a λgt10 library of EcoRI-digested OregonR genomic DNA fragments. The right panel shows that a subcloned restriction fragment from one of the positive clones hybridized, at high stringency, only to the H2.0 band on the genomic Southern blot.

The longest cDNA inserts, found in six of the clones, had identical restriction maps and were all 1.7 kb in length. One of these, λc5, was selected for more detailed structural and functional study. In lieu of a mutant phenotype, our preliminary designation for this homeobox locus, H2.0, reflects its homeobox homology and its original detection on a 2.0 kb genomic EcoRI fragment.

Fig. 2. Physical mapping of H2.0. (a) Molecular map of H2.0. The restriction site map of the 1680 bp cDNA clone, pG2.7, appears in the center of the figure. Below is the map of 28.3 kb of genomic DNA cloned from a CantonS library using pG2.7 as the probe. The extent of the most outlying clones is indicated below the map. Restriction sites below the genomic map were determined from those clones. The cDNA probe hybridized to two contiguous genomic fragments of 5.1 and 2.6 kb. Mapping gels and Southerns of subclones of those fragments further restricted the possible regions of the cDNA to the stippled regions boxed above the genomic map. Restriction sites above the line were derived from cDNA subclones and do not reflect an exhaustive set of sites for those enzymes throughout the genomic region. Intron-exon boundaries have not yet been precisely determined, nor has the definitive 5' end of the transcription unit. The open arrow at the top summarizes the sequence structure of the cDNA clone, which contains a single extended ORF. A conceptual translation of this open reading frame from the optimal consensus start site (which is the second ATG of the long ORF, and is indicated on the arrow) yields a 410 amino acid protein with a diverged homeodomain and a poly-glutamine/poly-histidine region encoded by an opa or M (CAX) repeat. (b) In situ mapping of a biotinylated cDNA probe to a wild-type Drosophila salivary gland squash. The arrow indicates the signal, which obscures the 26B1-2 doublet on the left arm of the second chromosome.
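The coordinates reported for the cDNA can be cross-checked with simple arithmetic. The sketch below takes the 1257 nt ORF length and the preferred start codon at bp 233 from the text. It assumes, from a reading of the Figure 4 sequence, that the first ATG of the ORF lies nine codons upstream of the preferred one, and that the quoted ORF length excludes the stop codon. All variable and function names are illustrative only.

```python
# Consistency check of the ORF coordinates reported for the H2.0 cDNA.
ORF_LENGTH_NT = 1257             # long ORF, as reported in the text
PREFERRED_START_BP = 233         # preferred initiation codon, as reported
CODONS_BEFORE_PREFERRED_ATG = 9  # ASSUMPTION: counted in the Figure 4 sequence


def protein_length(orf_nt, skipped_codons):
    """Residues encoded when translation starts `skipped_codons` into the ORF.

    ASSUMPTION: `orf_nt` does not include the stop codon.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return orf_nt // 3 - skipped_codons


# Position of the first ATG implied by the assumptions above.
first_atg_bp = PREFERRED_START_BP - 3 * CODONS_BEFORE_PREFERRED_ATG
n_residues = protein_length(ORF_LENGTH_NT, CODONS_BEFORE_PREFERRED_ATG)
# Last coding nucleotide when translation starts at the preferred ATG.
last_coding_bp = PREFERRED_START_BP + 3 * n_residues - 1

print(f"first ATG of the ORF at bp {first_atg_bp}")
print(f"predicted protein length: {n_residues} residues")
print(f"last coding nucleotide at bp {last_coding_bp}")
```

Under these assumptions the arithmetic reproduces the 410 residue protein quoted in the caption, which suggests that the reported ORF length, start position and protein size are mutually consistent.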

Molecular characterization

We used the cDNA to isolate the genomic clones encoding its transcription unit by hybridizing it at high stringency to 500 000 plaques of a CantonS genomic DNA library. In this screen we isolated eight overlapping genomic clones. Seven of these included a 2.6 kb EcoRI fragment which hybridized to our homeobox-specific probe, and five included part or all of a 5.0 kb EcoRI fragment which hybridized to our cDNA probe. No other clones were identified by hybridization to our cDNA clone, although we have not definitively ruled out the possibility that our cDNA is not full length, and that a micro-exon containing the true 5' end of this gene is located elsewhere. The 2.6 kb size of the homeobox-containing EcoRI fragment in CantonS represents a polymorphism in EcoRI sites between the sequences of the CantonS and OregonR chromosomes. The map of two genomic clones which span the greatest extent of this region is shown in Figure 2a, along with maps of the cDNA clone and of its open reading frame (ORF). We have cytogenetically localized H2.0 on polytene chromosomes by in situ hybridization and enzymatic


[Figure 3: Northern blot. Legible labels: 1.85 kb (H2.0 transcript) and Actin (loading control).]

Fig. 3. Northern analysis of H2.0 expression. Poly(A)+ RNA from 0-6, 6-12 and 12-24 h embryos as well as first (1st), second (2nd) and third (3rd) instar larvae, pupae and adults was prepared, electrophoresed and blotted as described in Materials and methods. The RNA blot was probed with nick-translated pG2.7 DNA, washed under stringent conditions and exposed to X-ray film to detect transcripts derived from the H2.0 gene. The amount of poly(A)+ RNA in the 3rd instar, pupal and adult lanes is approximately twice that of the other stages, as estimated from a separate probing of the blot with the genomic 6C actin gene, whose transcript levels are roughly equivalent at all developmental stages tested (Fyrberg et al., 1983). The size of the major transcript (1.85 kb) is indicated.
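The loading correction described in this caption amounts to dividing each lane's H2.0 signal by the relative amount of RNA loaded in that lane. A minimal sketch follows. The twofold loading factors are the approximate values stated in the caption; the band intensities are hypothetical placeholders, not measurements from the blot, and the function name is illustrative.

```python
STAGES = ["0-6 h", "6-12 h", "12-24 h", "1st instar", "2nd instar",
          "3rd instar", "pupae", "adults"]

# Relative poly(A)+ RNA load per lane: about 2x for the last three stages,
# as stated in the caption; 1x is assumed for every other lane.
RELATIVE_LOADING = {stage: 1.0 for stage in STAGES}
for stage in ("3rd instar", "pupae", "adults"):
    RELATIVE_LOADING[stage] = 2.0


def normalize(raw_signal, loading=RELATIVE_LOADING):
    """Divide each lane's signal by its relative RNA load."""
    return {stage: raw_signal[stage] / loading[stage] for stage in raw_signal}


# HYPOTHETICAL densitometry readings in arbitrary units (not data from the paper).
hypothetical_signal = {
    "6-12 h": 8.0,
    "12-24 h": 6.0,
    "3rd instar": 4.0,
    "adults": 2.0,
}

corrected = normalize(hypothetical_signal)
for stage, value in corrected.items():
    print(f"{stage}: {value:.1f}")
```

With such a correction, a band of equal raw intensity in an embryonic lane and an adult lane would correspond to roughly half the relative transcript abundance in the adult sample.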

detection of a biotinylated cDNA probe (Langer-Safer et al., 1982). As is shown in Figure 2b, the H2.0 locus maps to 26B1-3. Similar in situs to mutant chromosomes mapped the cDNA hybridization to the right of the GdhA deficiency whose proximal breakpoint maps to 26A9 (Kotarski et al., 1983), and to the left of the T(Y;2)D211 translocation, which maps to 26B3 (Lindsley et al., 1972; A.Erlebacher and M.Barad, unpublished results).

When the cDNA clone was used as a probe for an all-stage OregonR Northern blot (Figure 3), it hybridized to a single major transcript with an apparent size of 1.85 kb. This transcript class is present throughout development beginning at 6 h of embryogenesis, with its greatest overall abundance in the period from 6 to 24 h. Hybridization searches of four third instar and one 12-24 h embryonic cDNA libraries did not identify any cDNAs corresponding to the weak signals at higher mol. wt (F.B.Johnson and M.Barad, unpublished results).

Using single and double stranded dideoxy-nucleotide sequencing, we have determined the entire sequence of the cDNA clone (Figure 4). The 1680 nt cDNA includes a single long open reading frame of 1257 nt, initiated by a methionine codon, which is in frame with a highly diverged homeobox and an M or opa (CAX) repeat (McGinnis et al., 1984a; Wharton et al., 1985). An initiation codon at bp 233 represents the most likely start for translation, based on a 5/6 match to the eukaryotic translational start sequence consensus (Kozak, 1986). The conceptual translation from this site would yield a 410 amino acid protein. The CAX repeat, located near the center of the ORF at bp 770, spans 75 bp and includes codons for histidine (12) and glutamine (9), as well as four non-CAX codons. As in most homeobox-containing genes, the H2.0 homeobox is located nearer the 3' end of the gene, starting at bp 1091. The translated homeodomain sequence is the most diverged so far described, with less than 40% identity over 66 amino acids to its nearest Drosophila relatives, eve and z2. The cDNA sequence does not include significant similarity to the PRD repeat or paired box (Frigerio et al., 1986; Bopp et al., 1986), nor does it have the Y-P-W-M peptide which is conserved upstream of many homeodomains of the homeotic selector subgroup in Drosophila, as well as in some homeodomain-containing proteins of vertebrates (Mavilio et al., 1986).

Embryonic expression

To study the expression pattern of H2.0 in embryos, we hybridized staged, paraffin-embedded and sectioned Drosophila embryos with 35S-labeled RNA probes transcribed in vitro in both anti-sense and sense directions from an H2.0 cDNA. In no case did the sense probe yield any specific hybridization (data not shown). The results of anti-sense hybridization are described below. (All stage numbers and nomenclature follow Campos-Ortega and Hartenstein, 1985.) We first detect H2.0 expression at embryonic stage 10, at 5 h of embryonic development, at the point of flexure where the invaginating posterior midgut primordium (PMG) folds back against the extended germ band. Figure 5 (a and b) shows a stage 9 embryo of 4 h 20 min, just before the first expression. Figure 5 (c and d) shows the first pattern of expression, during stage 10 of embryogenesis.
This pattern extends as a bar of expression in the mesodermal layer of the germ band, which crosses the midline [Figure 5 (h and i), cross section on the left], but does not extend more anteriorly in the germ band, even in the lateral regions (data not shown). A second pattern of expression replaces the first by mid-stage 11, ~6 h 20 min of development, as the PMG continues to move dorsally and anteriorly (towards the posterior pole of the egg, however, since the germ band is fully extended at this stage, and the posterior-most embryonic coordinates extend forward along the dorsal side of the egg). At this stage, H2.0 expression appears in two lateral bands, in the single layer of mesodermal cells nearest the yolk, along the entire length of the germ band from the posterior midgut to the anterior midgut invagination [Figure 5 (e-i), cross section on the right]. At this stage there is no midline expression detectable anywhere, even in the location of stage 10 expression. No embryo that we have examined through serial cross-sections has both midline labeling and the extended lateral expression characteristic of stage 11. We have no evidence to determine whether this change in pattern is due to migration of expressing cells from the midline or to a shut-down of expression in some cells. During late stage 11, at ~7 h of development, tracheal pits appear in the lateral ectoderm. At about the same time, H2.0 transcripts begin to accumulate in a small subset of ectodermal cells, just lateral to the tracheal pits [Figure 6 (a-c)]. This ectodermal expression is also restricted

M.Barad et al. 75 50 25 AGTTCTGAGT GGGACGTCTT GCAAGGCGGT TTAATCGGAG GTGCACGAGA CCTAAGCCAA AATTCGCGAT CGTTTTATTG GTGCACAGCG 175 150 125 100 ACGGTGTTTT GGACATGATA GATAGTCCCT ATCGATTGTG AACGTGTGAA ATCAGTTTTG AGTGTCCTGT GTCGAGATTC GTAGATAGAA 250 225 200 ACCATATACA ACACATATCC GGACA ATG TTA CTA CAC GAG AGT GCC GCC AGC ATG GAA CAG AGT ATG CCC GAG AAC CTC Met Leu Leu His Glu Ser Ala Ala Ser Met Glu Gln Ser Met Pro Glu Asn Leu> 325 300 275 AGC ACT CAC ATG TAC GGC GAG TGT GAA GTG AAT CCA ACG CTG GCA AAG TGT CCC GAT CCA GTG AAT GTC GAT CAT Ser Thr His Met Tyr Gly Glu Cys Glu Val Asn Pro Thr Leu Ala Lys Cys Pro Asp Pro Val Asn Val Asp His> 400 375 350 GAG CTG CCC ACC AAA GAA TCC TGT GCC TCG ACT ACC ATA GTC AGT ACG TCG CCA ACC AGC GCC ACG AGC ACC ACC Glu Leu Pro Thr Lys Glu Ser Cys Ala Ser Thr Thr Ile Val Ser Thr Ser Pro Thr Ser Ala Thr Ser Thr Thr> 475 450 425 AAA GTC AAG CTG AGC TTC AGT GTG GAT CGT CTG TTG GGC TCG GAA CCC GAG GAA TCC CAC CGG CAG AGC TCC TCC His Arg Ser Ser Ser> Pro Gln Glu Glu Ser Lys Val Lys Leu Ser Phe Ser Val Asp Arg Leu Leu Gly Ser Glu

550 525 500 TCG CCC TCG ACC AAG TCA TGC TGC GAT GGC AGC ATT CTA GCC TGC TGC TCC TTC CCC CAC TGC TTC AGC CAG GCG Ser Pro Ser Thr Lys Ser Cys Cys Asp Gly Ser Ile Leu Ala Cys Cys Ser Phe Pro His Cys Phe Ser Gln Ala>

625

600

575

AAT GCC GAA TCC CGT AGG TTT GGA CAT GCC ACG CTT CCG CCC ACT TTC ACA CCC ACT TCA TCG CAC ACG TAC CCC Asn Ala Glu Ser Arg Arg Phe Gly His Ala Thr Leu Pro Pro Thr Phe Thr Pro Thr Ser Ser His Thr Tyr Pro>

700 675 650 TTC GTG GGT CTG GAT AAG CTG TTC CCT GGC CCC TAT ATG GAT TAC AAA TCA GTG CTC AGA CCG ACG CCA ATT CGG Phe Val Gly Leu Asp Lys Leu Phe Pro Gly Pro Tyr Met Asp Tyr Lys Ser Val Leu Arg Pro Thr Pro Ile Arg> 775 750 725 GCA GCA GAA CAC GCC GCA CCC ACT TAT CCC ACG CTG GCC ACC AAT GCG CTC CTC CGC TTC CAT CAA CAT CAG AAG Ala Ala Glu His Ala Ala Pro Thr Tyr Pro Thr Leu Ala Thr Asn Ala Leu Leu Arg Phe His Gln His Gln Lys> 850 825 800 CAG CAG CAC CAG CAG CAT CAT CAT CAT CAG CAC CAT CCC AAA CAC CTC CAT CAG CAG CAC AAG CCC CCG CCA CAC His Lys Pro Pro Pro His> Gln Gln His Gln Gln His His His His Gln His His Pro Lys His Leu His Gln Gln 925 900 875 AAC TCG ACG ACA GCC ACG GCG CTA CTG GCG CCG CTC CAC AGC CTC ACG AGC CTC CAG CTG ACG CAA CAG CAG CAG Asn Ser Thr Thr Ala Thr Ala Leu Leu Ala Pro Leu His Ser Leu Thr Ser Leu Gln Leu Thr Gln Gln Gln Gln> 1000 975 950 CGA TTT CTG GGC AAG ACG CCG CAG CAG CTC CTG GAC ATA GCG CCC ACC TCG CCC GCC GCC GCC GCC GCC GCA GCG Ala Ala Ala Ala Ala> Pro Ala Ala Ala Pro Thr Ser Ile Pro Leu Leu Gln Asp Gln Arg Phe Leu Gly Lys Thr 1075 1050 1025 ACA TCC CAG AAC GGT GCA CAT GGA CAT GGA GGT GGC AAT GGC CAG GGC AAC GCC TCA GCC GGA AGC AAT GGG AAG Thr Ser Gln Asn Gly Ala His Gly His Gly Gly Gly Asn Gly Gln Gly Asn Ala Ser Ala Gly Ser Asn Gly Lys> 1150 1125 1100 CGC AAG CGA TCC TGG TCT CGC GCC GTC TTC TCG AAT CTG CAG CGC AAG GGC CTG GAG ATT CAG TTC CAG CAG CAG Arg Lys Arg Ser Trp Ser Arg Ala Val Phe Ser Asn Leu Gln Arg Lys Gly Leu Glu Ile Gln Phe Gln Gln Gln>

1225 1200 1175 AAG TAC ATC ACC AAG CCA GAT CGC CGC AAG CTG GCG GCA CGA CTG AAT CTC ACC GAC GCT CAG GTC AAG GTG TGG Lys Tyr Ile Thr Lys Pro Asp Arg Arg Lys Leu Ala Ala Arg Leu Asn Leu Thr Asp Ala Gln Val Lys Val Trp> 1300

1275

1250

TTC CAG AAC CGT CGC ATG AAG TGG CGG CAC ACG CGC GAG AAT CTA AAA AGT GGC CAG GAG AAG CAG CCG AGT GCA Phe Gln Asn Arg Arg Met Lys Trp Arg His Thr Arg Gin Asn Leu Lys Ser Gly Gln Glu Lys Gln Pro Ser Ala>

1375

1350

1325

GTA CCC GAA TCC GGA GGT GTC TTC AAG ACA TCC ACG CCA TCC GGT GAT GGT GCG CCC CAG GAG GCA CTC GAC TAT Val Pro Glu Ser Gly Gly Val Phe Lys Thr Ser Thr Pro Ser Gly Asp Gly Ala Pro Gln Glu Ala Leu Asp Tyr>

1450 1425 1400 AGT TCC GAT AGC TGC TCC AGT GTG GAT TTG AGC GAG CAG GCC GAC GAA GAT GAT AAT ATT GAA ATC AAT GTG GTG Ser Ser Asp Ser Cys Ser Ser Val Asp Leu Ser Glu Gln Ala Asp Glu Asp Asp Asn Ile Glu Ile Asn Val Val> 1525

1500

1475

GAG TAG AGCCG CCTATGTAGA TAGCTCCGTT TACACGTGTA AAATGCCATG TAGATGTATT TATTCCTTCA TTGCTAGTTG

Glu End>

1600

1575

1550

1625

TAATGTAATG TAAAGTAGAC AGATACTGCA CAAAGTGAAC TTAAATTATT TCAATCTGTA AGATAATCTA AACGTTTATG TGTAAACTAT

16 50

16 75

TTAAAATAAA TTCTCTAGTA AAATGAACTT GTGAAAAAAA AAAAAAAAAA

Fig. 4. Sequence of an H2.0 cDNA. The sequence of both strands of the pG2.7 cDNA was determined by the dideoxy-chain termination method using subclones in pBluescrybe for double-stranded and in mp18 for single-stranded sequencing (Messing, 1983). The sequence is shown with a conceptual translation of the longest open reading frame. The boxed ATG represents the most likely start of translation based on a 5/6 match to the eukaryotic translational start sequence as described by Kozak (1980). The overlined region starting at base 770 is a 75 nt CAX (opa or M) repeat including 21 CAX triplets coding for histidine (12) and glutamine (9) (Wharton et al., 1985; McGinnis et al., 1984; Regulski et al., 1985). The solid underlining indicates the homeobox homology. The extended homology in many other homeodomains of one amino acid upstream and five downstream is indicated by broken underlining. The overlined region at 1635-1640 is the presumptive polyadenylation signal sequence.
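The codon composition quoted for the CAX repeat can be checked directly against the sequence. The sketch below tallies codons in a 75 nt stretch transcribed from the Figure 4 sequence, on the assumption that this stretch corresponds to the overlined repeat beginning near base 770. The helper name `codon_census` is illustrative, and the transcription should be verified against the original figure before being relied upon.

```python
from collections import Counter

# 75 nt stretch transcribed from the Figure 4 sequence.
# ASSUMPTION: this corresponds to the overlined CAX repeat (starting near bp 770).
CAX_REGION = (
    "CATCAACATCAGAAG"
    "CAGCAGCACCAGCAG"
    "CATCATCATCATCAG"
    "CACCATCCCAAACAC"
    "CTCCATCAGCAGCAC"
)

HIS_CODONS = {"CAT", "CAC"}  # histidine
GLN_CODONS = {"CAA", "CAG"}  # glutamine


def codon_census(seq):
    """Count His, Gln and other codons in an in-frame nucleotide sequence."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    counts = Counter({"His": 0, "Gln": 0, "other": 0})
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon in HIS_CODONS:
            counts["His"] += 1
        elif codon in GLN_CODONS:
            counts["Gln"] += 1
        else:
            counts["other"] += 1
    return counts


census = codon_census(CAX_REGION)
print(len(CAX_REGION), dict(census))
```

For this stretch the tally gives 12 histidine codons, 9 glutamine codons and 4 non-CAX codons, that is 21 CAX triplets among 25 codons, in agreement with the figures stated in the caption and the text.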


Fig. 5. Localization of early expression of H2.0 by in situ hybridization. In Figures 5 and 6, embryonic tissue sections were prepared as described in Materials and methods and probed with anti-sense RNA prepared in vitro from pG2.7 linearized in the polylinker with HindIII. Sense strand probes were also prepared and hybridized to parallel sections as a control, but showed no specific hybridization (data not shown). Except in cases of cross-sections as specified below, all embryonic sections are para-sagittal and are aligned with anterior to the left and dorsal up. Embryonic staging and timing follow the scheme of Campos-Ortega and Hartenstein (1985). Most sections are underexposed to allow visualization of cellular morphology behind the grains so that the signal is more easily seen on the accompanying darkfield photographs. (a) A late stage 9 embryo of ~4 h 20 min, in which many neuroblasts have segregated from the ectoderm and the proctodeal opening has reached 70% egg length, shows no expression above background in darkfield of the same section (b). (c and d) Late stage 10 embryo of ~5 h 20 min showing earliest H2.0 expression, which occurs in the mesoderm at the point of flexure between the invaginating posterior midgut (pmg) and the extended germ band. (e and f) The pattern of expression changes by mid-stage 11 (6 h 20 min). At this stage expression occurs in the single cell layer of visceral mesoderm in continuous bands on either side of the midline, running the entire distance from posterior midgut invagination to anterior midgut invagination (amg). [Though the hybridization signal seems interrupted in (f), this is an artifact of the plane of section, and parallel sections show that it is, in fact, continuous. Data not shown.] (g) A close-up of panel (e) shows that the grains of the signal are concentrated over the single layer of cells most closely adjoining the yolk sac (black globules). (h and i) Cross-sections of a stage 10 (left) and a stage 11 (right) embryo illustrate the distinct patterns of expression at these two stages. Dashed lines in panels (c) and (e) indicate the approximate planes, respectively, of these sections. 'D' and 'V' indicate the dorsal and ventral orientation of each embryo. Stage 10 expression is at a single anterior-posterior locus, the point of flexure between pmg and germ band, and crosses the midline. More anterior and posterior parallel sections show no expression (data not shown). Stage 11 expression appears as two continuous bands on either side of the midline in the visceral mesoderm. (Four areas of signal appear because the extended germ band appears on both the ventral and, folded back, on the dorsal aspects of the embryo.) st, stomodeum.


Fig. 6. Later expression of H2.0 by in situ hybridization. (a and b) At late stage 11, ~7 h, tracheal pits (tp and arrow) appear in the ectoderm. At the same time a new pattern of H2.0 expression appears, in laterally and anterior-posteriorly restricted regions of the ectoderm, while mesodermal expression continues. (c) A close-up of the region around the tracheal pit of panel (a) shows that the region of ectodermal expression lies just lateral to the pit. (d and e) At germ band retraction, stage 13, 10 h, embryos show continuation of both the mesodermal and the ectodermal expression patterns. This para-sagittal section shows that the ectodermal transcripts accumulate only in the posterior compartment of each segment. Although the signal is not obvious in every segment of this photograph due to the plane of section, it can be detected in the clypeolabral, mandibular, and maxillary lobes, as well as in all segments from T1 to A8 in parallel sections of this embryo (data not shown). This stage also shows the most characteristic morphology of the visceral musculature precursors, the double palisade structure of the splanchnopleura (sp). (f and g) In higher magnifications of (d) and (e), the mesodermal signal is clearly restricted to the distinctive layer of the sp, and the ectodermal signal to the posterior compartment of each segment. This magnification also shows that there is a narrow line of cells accumulating transcript that links each region of posterior compartment expression to the mesodermal expression. (h and i) At stage 15, 12 h, while dorsal closure is proceeding, the combined mesodermal and ectodermal expression continues. The mesodermal expression has spread and flattened along with the visceral muscles, which become a thin layer applied continuously around the midgut, and the ectodermal expression remains restricted to a very small percentage of the dorsal-ventral axis. Arrows indicate hindgut (hg) and dorsal vessel (dv), which show no H2.0 expression. (j and k) Higher magnification of the area indicated by the hollow arrow in panel (h) illustrates the narrow strand of expressing cells linking the ectodermal with the mesodermal expression. The black arrow indicates that the cells in this region have small nuclei, unlike most of the ectodermal cells at this stage. (l and m) Stage 17 embryo of ~18 h shows nearly completed organogenesis, with easily discernible pharyngeal structures (ph), proventriculus (pv), sinuous midgut, and hindgut (hg). Internal H2.0 expression occurs only around structures with musculature derived from visceral mesoderm and not around hindgut or pharynx. Signal appears only around the midgut side of the proventriculus, indicating this structure as a point of transition. The ectodermal expression continues at this stage, and in addition, one isolated area of ectodermal expression appears in dorsal anterior T1, a region apparently unrelated to the earlier loci of ectodermal H2.0 expression. Abbreviations: a4 = fourth abdominal segment, lb = labial lobe, mg = midgut endoderm, pmg = posterior midgut invagination, t1 = first thoracic segment, vnc = ventral nerve cord, y = yolk.


[Figure 7: similarity tree of homeodomain sequences from Drosophila, mouse, human, Xenopus, sea urchin and yeast. The tree layout and branch-point numbers are not recoverable here.]

Fig. 7. Similarity tree of known homeodomains and yeast relatives. Most of the published homeodomain sequences are arranged in the above figure in nested sets based on the number of divergent residues in all cross comparisons between amino acids from -1 to 65 (for the numbering scheme, see Figure 8). Numbers at the branch points indicate the minimum number of changes between members of the divided branches. H2.0 has the most diverged homeodomain so far characterized, with a minimum of 40/66 changes between it and its nearest relatives. Amino acid sequences were taken from the following references: MATα2 and MATa1, Astell et al., 1981; BSH-4 and BSH-9, Baumgartner et al., 1987; paired, Frigerio et al., 1986; caudal, Mlodzik et al., 1985; even skipped, MacDonald et al., 1986; zerknüllt and z2, Rushlow et al., 1987; fushi tarazu, Laughon and Scott, 1984; Hox-1.4, Duboule et al., 1986; HHox-c13, Mavilio et al., 1986; Xhox-1A, Harvey et al., 1986; Deformed, Regulski et al., 1987; Hox-2.1, Krumlauf et al., 1987; Sex combs reduced, Kuroiwa et al., 1985; HHox-c1 and HHox-c8, Simeone et al., 1987; Antennapedia, Schneuwly et al., 1986; MM3, Muller et al., 1984; Hox-1.1 and Hox-1.2, Colberg-Poley et al., 1985; HB1, Dolecki et al., 1986; Ultrabithorax, Weinzierl et al., 1987; Hox-2.2, Hauser et al., 1985; Hox-3.1, Awgulewitsch et al., 1986; engrailed, Poole et al., 1985; invected, Coleman et al., 1987; En-1, Joyner and Martin, 1987; bicoid, Frigerio et al., 1986, and M.Noll, personal communication; H2.0, this paper.
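The branch-point numbers described in this caption are minimum counts of differing residues between two groups of aligned sequences. The sketch below shows one way such counts could be computed; it is an illustration under stated assumptions, not the procedure the authors used. The three short sequences are hypothetical toy strings, not real homeodomains, and all function names are illustrative.

```python
from itertools import product


def count_changes(a, b):
    """Number of positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def min_changes_between(group_a, group_b):
    """Smallest pairwise difference count between members of two groups."""
    return min(count_changes(a, b) for a, b in product(group_a, group_b))


def percent_identity(n_changes, length):
    """Percent identity implied by a number of changes over an aligned length."""
    return 100.0 * (length - n_changes) / length


# HYPOTHETICAL toy sequences, used only to exercise the functions.
seq_a = "ACDEFGHI"
seq_b = "ACDEFGHK"
seq_c = "MCDWFGYK"

print(count_changes(seq_a, seq_b))                   # close pair
print(min_changes_between([seq_a, seq_b], [seq_c]))  # branch-point style value
# Figure from the caption: 40 changes over the 66 compared positions.
print(round(percent_identity(40, 66), 1))
```

Applied to the value in the caption, 40 changes over 66 positions corresponds to 26 identical residues, or about 39.4% identity.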

longitudinally; in sagittal sections of late stage 11 embryos, transcripts appear periodically, just anterior to the tracheal pit in each parasegment (data not shown). We reconstruct this pattern in three dimensions as a small patch of expression in each parasegment just lateral and anterior to the tracheal pit. Both the mesodermal and the ectodermal patterns are maintained throughout the rest of embryonic development. At germ band retraction [Figure 6 (d and e)], the visceral mesoderm, having divided into a two cell layer called the splanchnopleura, takes on a highly characteristic morphology, as the cells of these two layers elongate and interdigitate. This morphology is easily distinguishable from

[Figure 8: alignment of homeodomain sequences, with positions marked from 1 to 60. Rows: UBX, ANTP, SCR, H2.0, EVE, Z2, MHOX 3-1, MHOX 2-4, AC-1, YEAST α2, YEAST a1. The residue-level alignment is not recoverable here.]

Fig. 8. Comparison of H2.0 homeodomain to other homeodomains. The amino acid sequence of the H2.0 homeodomain from -1 to 65 is aligned with a subset of other homeodomain sequences. Amino acids identical to the H2.0 sequence are shaded and boxed. Above the H2.0 sequence appear Antp and Ubx as examples of the most conserved class of homeodomains, and Scr, which was used for cloning H2.0 (McGinnis et al., 1984b; Kuroiwa et al., 1985). Immediately below H2.0 appear the homeodomains of its nearest relatives, from eve, z2, two mouse genes and one Xenopus gene, as well as the regions of similarity in the homeodomains of the yeast mating type proteins (MacDonald et al., 1986; Rushlow et al., 1987; Awgulewitsch et al., 1986; Hart et al., 1987; Carrasco et al., 1984; Astell et al., 1981). The asterisk in the yeast α2 sequence indicates the location of a 3 amino acid deletion (I E N) made to maximize the amount of its similarity to the homeodomain. H2.0 contains the most diverged Drosophila homeodomain so far described, with only 39.4% identity over the 66 amino acids of the extended homeodomain to its nearest neighbors. Although most of these conserved amino acids appear in the putative DNA binding region of the carboxy terminus, we note that several highly conserved amino acids appear throughout the amino part of the domain, and show a periodicity of 3-4 residues (e.g. residues 1, 4, 8, 11, 15/16 and 20), supporting the hypothesis (P.O'Farrell, personal communication) that such identical amino acids may represent a conserved face of one, or more, alpha-helices.
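The periodicity argument in this caption can be illustrated with a helical wheel projection, in which each successive residue of an ideal alpha-helix is rotated by 100 degrees (3.6 residues per turn). The sketch below computes the projected angle of each conserved position listed in the caption, relative to residue 1. It assumes a single uninterrupted ideal helix over residues 1-20, which the paper does not claim, and the function name is illustrative.

```python
DEGREES_PER_RESIDUE = 100.0  # ideal alpha-helix: 360 degrees / 3.6 residues


def wheel_angle(position, reference=1):
    """Angle (degrees, 0-360) of a residue on a helical wheel, relative to `reference`."""
    return ((position - reference) * DEGREES_PER_RESIDUE) % 360.0


# Conserved positions listed in the caption (15/16 given as alternatives).
conserved_positions = [1, 4, 8, 11, 15, 16, 20]
angles = {pos: wheel_angle(pos) for pos in conserved_positions}

for pos, angle in angles.items():
    print(f"residue {pos:2d}: {angle:5.1f} degrees")
```

In this idealized projection, residues 1, 4, 8, 11 and 15 fall within an arc of 80 degrees (280 to 360 degrees), while residues 16 and 20 lie 60 and 100 degrees from residue 1. A clustering of this kind is what a conserved helical face would be expected to produce, although the projection alone does not establish that the region is helical.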

that of the larger and rounder cells of both the endodermal and the somatic mesodermal layers. At higher magnification [Figure 6 (f and g)], transcripts of H2.0 can be detected throughout the region of tissue with this specific morphology, while the adjoining tissues show no signal over background. At this stage, one can easily detect the ectodermal expression, which is clearly restricted to the posterior region of each segment extending into the segmental groove. The section in these photographs does not show all such regions of expression, but parallel sections of the same embryo, and of others (data not shown), reveal that H2.0 expression occurs in the restricted regions of the mandibular, maxillary, and clypeo-labral lobes of the head as well as in all thoracic segments and abdominal segments through A8. At this stage, it is also evident that there is a line of expressing cells which connects each patch of ectodermal expression with the expression in the splanchnopleura (Figure 6g).

As dorsal closure occurs, the yolk is completely surrounded by gut endoderm, which forms the digestive apparatus. The visceral musculature forms as the cells of the splanchnopleura flatten out to form a thin sheet of cells enveloping the gut endoderm. In Figure 6 (h and i), it is clear that H2.0 mesodermal expression occurs exclusively in midgut-associated tissues, in the descendants of visceral mesodermal precursors only. No labeling is apparent around the hindgut or dorsal vessel, whose musculature derives from somatic mesoderm. At higher magnification [Figure 6 (j and k)], the expression which joins the patch of ectodermal expression with the mesodermal expression seems to coincide with a line of nuclei which are smaller than those of most surrounding cells.

At later stages of embryonic development, as organogenesis proceeds, the mesodermal expression continues to be exclusively associated with midgut structures. Some H2.0 expression appears on the midgut side of the proventriculus of the embryo, perhaps reflecting a mixed origin for the muscles of this structure. No detectable expression is associated with hindgut structures or pharynx, whose musculature is somatic in origin [Figure 6 (l and m)]. Expression in the lateral ectoderm continues in an even more restricted region of the ventro-lateral region of each segment, perhaps in nuclei of a subset of apodemes. One new region of expression appears just anterior to the pharyngeal muscles [Figure 6 (l and m)] in a band which appears to wrap laterally and dorsally around the anterior pharynx. We tentatively identify this as the region giving rise to the dorsal prothoracic pharyngeal muscle.

Discussion

In a search for new Drosophila homeobox genes, we have isolated a previously unidentified locus, H2.0, by homology at low stringency to the Scr homeobox region. The region of homology to Scr in H2.0 includes the most diverged metazoan homeobox so far described, with only a 51% nucleotide similarity to the Scr sequence. (We have extended our homeobox comparisons to 198 bp and 66 amino acids to reflect conservation outside the original homeobox in close relatives, like the Deformed group.) H2.0 codes for an extremely diverged homeodomain, with less than 40% amino acid similarity to its nearest relatives (Figure 7). Though very diverged, this homeobox gene shares the characteristic of a temporally and spatially restricted expression pattern in common with all the other known homeobox genes. However, the extreme divergence of the homeobox of H2.0 correlates with a novel pattern of expression among homeobox genes: a pattern which includes all the precursors of a single tissue type. This suggests that H2.0 may perform a novel genetic function among homeobox genes as well, that of specifying the morphogenesis of a single tissue type.

The 51% similarity of the H2.0 homeobox sequence to Scr marks this as an extremely diverged sequence at the nucleotide level. The ability of the Scr homeobox to hybridize to this gene probably depends on a homologous stretch of 25 nt with a single mismatch at the 3' end of the homeobox, putatively coding for the most highly conserved peptide sequence of the homeodomain: W-F-Q-N-R-R-M-K. The sequence of the H2.0 homeodomain is even more diverged, with only 39.4% identity, over the 66 amino acids from -1 to 65, with its nearest relatives. Because it is so diverged, a comparison of the conserved amino acids of this sequence with a variety of other homeodomains (Figure 8) may reveal something about the most important structural subdomains of this region. Most of the amino acid identity between H2.0 and other homeodomains


Fig. 9. Schematic summary of H2.0 expression. Diagrammatic embryos are aligned as the in situs, with anterior to left and dorsal up. Numbers to the left of each embryonic outline refer to the stages of Campos-Ortega and Hartenstein (1985). At stage 10, first H2.0 expression occurs at the point of flexure between the posterior midgut invagination (pmg) and the extended germband in a group of cells that crosses the midline. At stage 11, a second pattern of mesodermal expression replaces the first. Cells of the visceral mesoderm (vm), the nearest cell layer to the yolk, express H2.0 in two bands on either side of the midline of the organism. In late stage 11 and later stages of germ band retraction and dorsal closure, H2.0 is expressed in all the cells destined to form visceral musculature, and most dramatically in all the cells of the splanchnopleura (sp) with its distinctive double palisade structure. Another, ectodermal, pattern of expression also appears in a restricted lateral region of the posterior compartment of each segment. These two patterns are connected by narrow strands of cells with small nuclei which also express H2.0. As organogenesis and head involution near completion at late stages, H2.0 expression is limited to a thin sheet of visceral muscle cells enveloping the midgut (but not hindgut nor pharyngeal structures), to a segmentally repeated structure of the ventro-lateral hypoderm and to a small patch of muscle in dorsal anterior T1. Abbreviations: a1 and a8 = abdominal segments 1 and 8, amg = anterior midgut invagination, as = amnioserosa, cl = clypeolabrum, hg = hindgut, mg = midgut endoderm, ph = pharyngeal structures, pmg = posterior midgut invagination, pr = proctodeum, pv = proventriculus, sg = salivary gland, sp = splanchnopleura, st = stomodeum, t1-t3 = thoracic segments 1-3, vm = visceral mesoderm.

is clustered at the carboxy-terminal region, as has been observed in previous cross-comparisons among homeodomains (Fjose et al., 1985). The region of identity in H2.0 includes only part of the putative α-helix 3 of the helix-turn-helix DNA binding motif, which spans residues 42-50 of the homeodomain (Ohlendorf et al., 1983; McKay and Steitz, 1981). The amino acid changes which have occurred in the amino end of this putative helical region in H2.0 are conservative ones, except for the alanine residue at position 43 of the homeodomain, where most other homeodomains have arginine. Since this places a hydrophobic side-chain in a location on the putative helix which would face the major groove of the DNA, it may represent an important change.

The similarity of H2.0 to the other homeodomains in the putative helix 2 homologous domain, which occupies residues 31-38 in the homeodomain, is even weaker. Of the eight residues in this region, only three to five are identical to those in other homeodomains, and the amino acid changes do not seem conservative, including changes from basic to hydrophobic (histidine to alanine at residue 36), hydrophobic to basic (isoleucine to arginine at residue 32 and alanine to arginine at residue 37), and acidic to basic (glutamate to lysine at residue 33). The three most conserved residues, numbers 31, 35 and 38, have a spacing (4 and 3 residues) which suggests that a single face of an α-helix may be the basic structural element conserved between H2.0 and the other homeodomains in this region, as has been suggested for the amino end of the homeodomain (P.H.O'Farrell, personal communication). This spacing of conserved residues is even more obvious at the amino terminus of the homeodomain, where in the region from -1 through 21 there are seven loci of identity with spacing of three or four residues, lending support to the hypothesis of strong conservation of a single face of one, or more, α-helices. This pattern of periodic identity extends even to the yeast homeodomains, so that, especially in the a1 protein, one can detect the suggestion of the same helical structural motif (especially if one allows the conservative replacement of arginine by lysine at residue 5).

The H2.0 expression pattern, as revealed by in situ hybridization with anti-sense RNA probes, is unique among homeobox-containing genes in metazoa, in that at least one part of that pattern corresponds to the entire distribution of a single tissue type and its precursors. H2.0 is first expressed in the single cell layer of the visceral mesoderm. It continues to be expressed in all the cells destined to become visceral musculature (Figure 9), as those cells divide into the two cell layers of the splanchnopleura, alter their shape to make the double palisade pattern characteristic of this tissue, and then flatten into the layer of visceral muscles which are closely applied to the sinuous gut. H2.0 expression never appears in the tissues which adjoin the precursors of visceral musculature, either the gut endoderm or the somatic mesoderm and musculature.

H2.0 also has a second pattern of transcript accumulation, which appears in small groups of cells restricted both to the posterior region of each segment and to a small subset of the ventro-lateral region of the ectoderm. The restriction to a small patch of cells on the dorsal-ventral axis distinguishes the periodic expression of H2.0 from that of a gene like engrailed, whose expression, though also restricted to the posterior compartment of each segment, completely rings the cellular blastoderm and germ layer, and thus shows no dorso-ventral restriction. We suggest that this hypodermal expression, combined with the expression in threadlike connections between the ectoderm and visceral mesoderm, delineates the anlagen of muscular anchors connecting gut and hypoderm (Kirby and Beck, 1986). We hope to determine the nature of these structures expressing H2.0 by means of antibody staining of whole-mount embryos.

The H2.0 locus maps to 26B1-3. This region of the genome contains no known morphogenetic mutants. Homozygosity for a large deletion that includes the H2.0 locus does result in embryonic lethality, which is no surprise, considering the amount of genetic material removed. Nevertheless, in these mutants the morphology of most epidermal structures and most internal organs appears to be normal. One striking exception to this generalization is the midgut, which never attains its normal sinuous tubularity, instead remaining as a balloon-like, yolk-filled sac (data not shown). We are presently isolating and characterizing lethal mutants in this region in order to test whether this phenotype results, in whole or in part, from the removal of H2.0 function.

The discovery of a highly diverged homeobox gene by screening with the Scr homeobox confirms the utility of using a variety of homeoboxes as probes for potential morphogenetic genes. Southern blot results indicate that H2.0 would not have yielded a hybridization signal intense enough to guarantee its isolation using the Antp or Ubx homeoboxes as probes, and using the H2.0 homeobox to hybridize to genomic Southerns suggests that it, in turn, may identify previously undiscovered homeobox-containing loci (data not shown). The evidence we have presented here indicates that homeobox homology may reveal not only new members of existing groups, but also the existence of entirely new groups of genes, defining new genetic functions whose only common feature may be their importance in the early steps of development. By analogy to the genetic function of other homeobox-containing genes, whose proper expression is essential to the normal morphogenesis of the anlagen where they are expressed, we hypothesize that H2.0 expression may be essential to the proper morphogenesis of the visceral musculature. It might, therefore, represent a member of a novel group of tissue-type determining genes which contain homeoboxes.
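The helical-face argument made above for residues 31, 35 and 38 can be illustrated with a helical-wheel projection. The sketch below is not part of the original analysis. It assumes an ideal α-helix with 3.6 residues per turn (100° of rotation per residue), a standard textbook value that the text does not state, and the names `wheel_angles` and `angular_spread` are hypothetical. Residues whose wheel angles fall within a narrow arc lie on one face of the helix.

```python
DEGREES_PER_RESIDUE = 100.0  # assumption: ideal alpha-helix, 3.6 residues per turn


def wheel_angles(positions):
    """Map each residue number to its angle (degrees) on a helical wheel."""
    return {p: (p * DEGREES_PER_RESIDUE) % 360.0 for p in positions}


def angular_spread(positions):
    """Smallest arc (degrees) of the wheel that contains all given residues."""
    angles = sorted(wheel_angles(positions).values())
    # Gaps between neighbouring angles, including the wrap-around gap.
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(360.0 - angles[-1] + angles[0])
    # The residues occupy everything except the largest empty gap.
    return 360.0 - max(gaps)


conserved = [31, 35, 38]    # most conserved residues of putative helix 2 (from the text)
contiguous = [31, 32, 33]   # three consecutive residues, for comparison

print(wheel_angles(conserved))
print(angular_spread(conserved), angular_spread(contiguous))
```

Under these assumptions the three conserved residues fall within a 60° arc, whereas three consecutive residues spread over 200°, which is the geometric sense in which a spacing of 3-4 residues points to a single conserved face.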

Materials and methods

Southern blots and probes

Drosophila genomic DNA from the OregonR-Munchen strain was isolated and blotted as described in McGinnis et al. (1984a). The Scr probe was the 400 bp MluI genomic fragment including the Scr homeobox (Kuroiwa et al., 1985), which was nick-translated, hybridized and washed under non-stringent conditions. H2.0 was originally cloned from a λgt10 library of OregonR genomic DNA using the same probe and hybridization conditions. The genomic Southern was probed under stringent conditions with a nick-translated 650 bp EcoRI-PstI fragment of the genomic clone which included the region of H2.0 homeobox homology. Stringent and non-stringent conditions were as described in McGinnis et al. (1984a).

RNA isolation and Northern analysis

RNA was isolated, electrophoresed and blotted as in Chadwick and McGinnis (1987). Approximately 10 μg/lane of poly(A)+ RNA was electrophoresed through a 0.7% formaldehyde-agarose gel and transferred without treatment to nitrocellulose using 20× SSC. Hybridization followed the protocol of McGinnis et al. (1984a) except that the hybridization solution contained 0.5% SDS and hybridization was performed at 55°C. Stringent wash conditions were 0.1× SSC at 65°C.

Nucleotide sequencing

The nucleotide sequence of the pG2.7 cDNA was determined by the dideoxynucleotide procedure (Sanger et al., 1977).

In situ hybridizations

In situ hybridizations to chromosomes were done with biotinylated probes (Langer-Safer et al., 1982) and detected enzymatically with a streptavidin-biotinylated horseradish peroxidase conjugate as described by the supplier (Enzo Biochemicals). Embryonic sections in paraffin and 35S-labeled riboprobes of pG2.7 were prepared as described in Ingham et al. (1985). Hybridization and washing of sections were done as described in Chadwick and McGinnis (1987), based on Hafen et al. (1983) and Akam (1983).

Acknowledgements

We are grateful to Nadine McGinnis and Adrian Erlebacher for help with chromosome in situs, and to Brad Johnson for arduous cDNA searches. Mike Kuziora generously allowed us to use his homeobox comparison program, HBCOMP. We would like to thank Vahid Yagmai, Dan Lindsley, Mary Beth Davis and R.J.MacIntyre for providing translocation stocks, and Brad Jones, Manuel Utset and Lenny Bogorad for the generous sharing of technical and analytical skills. Thanks also to Markus Noll for communicating unpublished bicoid sequences. M.B. was supported by the Medical Scientist Training Program of the NIH, National Research Award GM07205 from the Institute of General Medical Sciences. This research was supported by grants to W.M. from the Searle Scholar Program, the Presidential Young Investigator Program of the NSF, the Camille and Henry Dreyfus Fund, and the Mathers Foundation.

References

Akam,M.E. (1983) EMBO J., 2, 2075-2084.
Astell,C.R., Ahlstrom-Jonosson,L., Smith,M., Tatchell,K., Nasmyth,K. and Hall,B.D. (1981) Cell, 27, 15-23.
Awgulewitsch,A., Utset,M.F., Hart,C.P. and Ruddle,F.H. (1986) Nature, 320, 328-335.
Baumgartner,S., Bopp,D., Burri,M. and Noll,M. (1987) Genes Devel., 1, 1247-1267.
Bopp,D., Burri,M., Baumgartner,S., Frigerio,G. and Noll,M. (1986) Cell, 47, 1033-1040.
Campos-Ortega,J.A. and Hartenstein,V. (1985) The Embryonic Development of Drosophila melanogaster. Springer-Verlag, Berlin.
Carrasco,A.E., McGinnis,W., Gehring,W.J. and DeRobertis,E.M. (1984) Cell, 37, 409-414.
Chadwick,R. and McGinnis,W. (1987) EMBO J., 6, 779-789.
Colberg-Poley,A.M., Voss,S.D., Chaowdhury,K., Stewart,C.L., Wagner,E.F. and Gruss,P. (1985) Cell, 43, 39-45.
Coleman,K.G., Poole,S.J., Weir,M.P., Soeller,W.C. and Kornberg,T. (1987) Genes Devel., 1, 19-28.
Dolecki,G.J., Wannakrairoj,S., Lum,R., Wang,G., Rilet,H.D., Carlos,R., Wang,A. and Humphreys,T. (1986) EMBO J., 5, 925-930.
Doyle,H.J., Harding,K., Hoey,T. and Levine,M. (1986) Nature, 323, 76-79.
Duboule,D., Baron,A., Mahl,P. and Gailliot,B. (1986) EMBO J., 5, 1973-1980.
Fjose,A., McGinnis,W. and Gehring,W.J. (1985) Nature, 313, 284-289.
Frasch,M., Hoey,T., Rushlow,C., Doyle,H. and Levine,M. (1987) EMBO J., 6, 749-759.
Frigerio,G., Burri,M., Bopp,D., Baumgartner,S. and Noll,M. (1986) Cell, 47, 735-746.
Frohnhofer,H.G. and Nusslein-Volhard,C. (1986) Nature, 324, 120-125.
Fyrberg,E.A., Mahaffrey,J.W., Bond,B.J. and Davidson,N. (1983) Cell, 33, 115-123.
Hafen,E., Levine,M., Garber,R.L. and Gehring,W.J. (1983) EMBO J., 2, 617-623.
Hart,C.P., Fainsod,A. and Ruddle,F.H. (1987) Genomics, 1, 182-195.
Harvey,R.P., Tabin,C.J. and Melton,D.A. (1986) EMBO J., 5, 1237-1244.
Hauser,C.A., Joyner,A.L., Klein,R.D., Learned,T.K., Martin,G.R. and Tjian,R. (1985) Cell, 43, 19-28.
Hoey,T., Doyle,H.J., Harding,K., Wedeen,C. and Levine,M. (1986) Proc. Natl. Acad. Sci. USA, 83, 4809-4813.
Ingham,P.W., Howard,K.R. and Ish-Horowicz,D. (1985) Nature, 318, 439-445.
Joyner,A.L. and Martin,G.R. (1987) Genes Devel., 1, 29-38.
Karch,F., Weiffenbach,B., Bender,W., Peifer,M., Duncan,I., Celneken,S., Crosby,M. and Lewis,E.B. (1985) Cell, 43, 81-96.
Kirby,P. and Beck,R. (1986) Zool. J. Linn. Soc., 86, 185-196.
Kornberg,T. (1981) Proc. Natl. Acad. Sci. USA, 78, 1095-1099.
Kotarski,M.A., Picket,S. and MacIntyre,R.J. (1983) Genetics, 105, 371-386.
Kozak,M. (1986) Cell, 44, 283-292.
Krumlauf,R., Holland,P.W.H., McVey,J.H. and Hogan,B.L.M. (1987) Development, 99, 603-618.
Kuroiwa,A., Kloter,U., Baumgartner,P. and Gehring,W.J. (1985) EMBO J., 4, 3757-3764.
Langer-Safer,P.R., Levine,M. and Ward,D.C. (1982) Proc. Natl. Acad. Sci. USA, 79, 4381-4385.
Laughon,A. and Scott,M.P. (1984) Nature, 310, 25-31.
LeCalvez,J. (1984) Bull. Biol. Fasc., 2-3, 7-17.
Levine,M., Harding,K., Wedeen,C., Doyle,H., Hoey,T. and Radomska,H. (1985) Cold Spring Harbor Symp. Quant. Biol., 50, 209-222.
Lewis,E.B. (1978) Nature, 276, 565-570.
Lewis,R.A., Wakimoto,B.T., Denell,R.E. and Kaufman,T.C. (1980) Genetics, 95, 383-397.
Lindsley,D.L. and Sandler,L. et al. (1972) Genetics, 71, 157-184.
MacDonald,P.M. and Struhl,G. (1986) Nature, 324, 537-545.
MacDonald,P.M., Ingham,P. and Struhl,G. (1986) Cell, 47, 721-734.
MacKay,V.L. and Manney,T.R. (1974a) Genetics, 76, 255-271.
MacKay,V.L. and Manney,T.R. (1974b) Genetics, 76, 273-288.
Mavilio,F., Simeone,A., Giampaolo,A., Faiella,A., Zappavigna,V., Acampora,D., Poiana,G., Russo,G., Peschle,C. and Boncinelli,E. (1986) Nature, 324, 664-667.
McGinnis,W., Levine,M., Hafen,E., Kuroiwa,A. and Gehring,W.J. (1984a) Nature, 308, 428-433.
McGinnis,W., Garber,R.L., Wirz,J., Kuroiwa,A. and Gehring,W.J. (1984b) Cell, 37, 403-408.
McKay,D.B. and Steitz,T.A. (1981) Nature, 290, 744-749.
Merrill,V.K.L., Turner,F.R. and Kaufman,T.C. (1987) Devel. Biol., 122, 379-395.
Messing,J. (1983) Methods Enzymol., 101, 20-77.
Muller,M.M., Carrasco,A.E. and DeRobertis,E.M. (1984) Cell, 39, 157-162.
Mlodzik,M., Fjose,A. and Gehring,W.J. (1985) EMBO J., 4, 2961-2969.
Morata,G. and Lawrence,P.A. (1975) Nature, 255, 614-617.
Nusslein-Volhard,C. and Wieschaus,E. (1980) Nature, 287, 795-801.
Ohlendorf,D.H., Anderson,W.F. and Matthews,B.W. (1983) J. Mol. Evol., 19, 109-114.
Poole,S.J., Kauvar,L.M., Drees,B. and Kornberg,T. (1985) Cell, 40, 37-43.
Regulski,M., Harding,K., Kostriken,R., Karch,F., Levine,M. and McGinnis,W. (1985) Cell, 43, 71-80.
Regulski,M., McGinnis,N., Chadwick,R. and McGinnis,W. (1987) EMBO J., 6, 767-777.
Rushlow,C., Doyle,H., Hoey,T. and Levine,M. (1987) Genes Devel., 1, 1268-1279.
Sanchez-Herrero,E., Vernos,I., Marco,R. and Morata,G. (1985) Nature, 313, 108-113.
Sanger,F., Nicklen,S. and Coulson,A.R. (1977) Proc. Natl. Acad. Sci. USA, 74, 5463-5467.
Schneuwly,S., Kuroiwa,A., Baumgartner,P. and Gehring,W. (1986) EMBO J., 5, 733-739.
Scott,M.P. and Weiner,A. (1984) Proc. Natl. Acad. Sci. USA, 81, 4115-4119.
Shepherd,J.C.W., McGinnis,W., Carrasco,A.E., DeRobertis,E.M. and Gehring,W.J. (1984) Nature, 310, 70-71.
Simeone,A., Mavilio,F., Acampora,D., Giampaolo,A., Faiella,A., Zappavigna,V., d'Esposito,M., Pannese,M., Russo,G., Boncinelli,E. and Peschle,C. (1987) Proc. Natl. Acad. Sci. USA, 84, 4914-4918.
Wakimoto,B.T., Turner,F.R. and Kaufman,T.C. (1984) Devel. Biol., 102, 147-172.
Weinzeierl,R., Axton,J.M., Ghysen,A. and Akam,M. (1987) Genes Devel., 1, 386-397.
Wharton,K.A., Yedvobnick,B., Finnerty,V.G. and Artavanis-Tsakonas,S. (1985) Cell, 40, 55-62.

Received on March 29, 1988

Note added in proof

These sequence data will appear in the EMBL/GenBank/DDBJ Nucleotide Sequence Databases under the accession number Y00843.
