Cell, Vol. 74, 29-42, July 16, 1993, Copyright © 1993 by Cell Press

A Homeotic Gene Cluster Patterns the Anteroposterior Body Axis of C. elegans

Bruce B. Wang,* Michael M. Müller-Immerglück,* Judith Austin,* Naomi Tamar Robinson,* Andrew Chisholm,†‡ and Cynthia Kenyon*
*Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, California 94143-0554
†Medical Research Council Laboratory of Molecular Biology, Cambridge CB2 2QH, England

‡Present address: Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139.

Summary

In insects and vertebrates, clusters of Antennapedia class homeobox (HOM-C) genes specify anteroposterior body pattern. The nematode C. elegans also contains a small cluster of HOM-C genes, one of which has been shown to specify positional identity. Here we show that two additional C. elegans HOM-C genes also specify positional identity and that together these three HOM-C genes function along the anteroposterior axis in the same order as their homologs in other organisms. Thus, HOM-C-based pattern formation has been conserved in nematodes despite the many differences in morphology and embryology that distinguish them from other phyla. Each C. elegans HOM-C gene is responsible for a distinct body region; however, where their domains overlap, two HOM-C genes can act together to specify the fates of individual cells.

Introduction

In spite of their extensive morphological and developmental differences, insects and vertebrates use a conserved system for specifying positional identity along their anteroposterior body axes. In both types of organisms, members of Antennapedia (Antp) class homeobox gene complexes, known as HOM-C or Hox genes, are expressed in a position-specific fashion where they function to specify cell fates (reviewed by McGinnis and Krumlauf, 1992). The order of HOM-C genes in the genome has been conserved during evolution, as has the colinearity between their positions in the genome and their domains of function in the animal. The fact that such similar HOM-C genes generate body pattern in animals as different as insects and vertebrates raises the question of whether other metazoans with still different morphologies and modes of embryogenesis also use HOM-C genes for pattern formation.

The nematode Caenorhabditis elegans differs from members of arthropod and vertebrate phyla in many ways, including its asymmetric embryonic cleavage pattern, its invariant cell lineage, and its nonsegmented body. Nevertheless, like other types of organisms, C. elegans does generate many position-specific structures along its anteroposterior axis. For example, in the hermaphrodite, the vulva and associated neuromuscular systems arise in the central body region. In the male, a complex copulatory apparatus arises in the posterior body region and the tail. In addition, a number of neural and mesodermal specializations are generated in specific locations along the anteroposterior axis in both sexes.

C. elegans contains a large number of homeobox genes (Bürglin et al., 1989), of which four are most similar to members of the Antp class. These four homeobox genes, the HOM-C genes, are located in the same region of the genome. Previous analysis of the homeoboxes suggested that the order of these genes in the C. elegans genome is the same as the order of their closest homologs in Drosophila (Kenyon and Wang, 1991; Bürglin et al., 1991). One of the C. elegans HOM-C genes, the Antp homolog mab-5, functions to generate structures and cell types that characterize the posterior body region. These include sensory rays and male copulatory muscles, neuronal fates, programmed cell deaths, and cell migrations in this body region (Kenyon, 1986; Costa et al., 1988). Mutations in mab-5 do not affect the development of other body regions. For example, structures that characterize the central body region, such as the vulva and vulval-associated neurons, are not affected, nor are many structures that characterize the anal body region and the tail.

Here we ask whether the C. elegans HOM gene cluster is responsible for generating pattern all along the body axis or whether mab-5 is just a remnant of an ancestral patterning system. By identifying the functions of two additional HOM-C genes, we find that this system does generate long-range anteroposterior pattern.
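The statement that the worm genes lie in the same order as their closest Drosophila homologs can be made concrete with a small check. The sketch below is illustrative only: the left-to-right worm order (lin-39, mab-5, egl-5) and the homolog assignments are those reported in this paper, the Drosophila gene order is the standard lab-to-Abd-B order, and the names `DROSOPHILA_ORDER`, `WORM_CLUSTER`, and `is_colinear` are hypothetical helpers introduced here. The fourth gene, ceh-13, is left out because its map position is not given in this section.

```python
# Standard order of the Drosophila HOM-C genes along the chromosome.
DROSOPHILA_ORDER = ["lab", "pb", "Dfd", "Scr", "Antp", "Ubx", "abd-A", "Abd-B"]
RANK = {gene: i for i, gene in enumerate(DROSOPHILA_ORDER)}

# Worm genes listed left to right, each with its closest fly homolog(s)
# as assigned in this paper (lin-39 resembles three fly genes equally well).
WORM_CLUSTER = [
    ("lin-39", ["pb", "Dfd", "Scr"]),
    ("mab-5", ["Antp"]),
    ("egl-5", ["Abd-B"]),
]


def is_colinear(cluster):
    """Return True if homolog ranks strictly increase along the cluster."""
    previous_max = -1
    for _gene, homologs in cluster:
        ranks = [RANK[h] for h in homologs]
        if min(ranks) <= previous_max:
            return False
        previous_max = max(ranks)
    return True


if __name__ == "__main__":
    print(is_colinear(WORM_CLUSTER))
```

Treating the three candidate homologs of lin-39 as a set avoids committing to one of them; the check only requires that all of them lie anterior to Antp in the fly order.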
The homeobox gene to the left of mab-5, a Sex combs reduced (Scr)/Deformed (Dfd)/proboscipedia (pb) homolog, is responsible for body structures and cell types that characterize the central body region. The homeobox gene to the right of mab-5, a likely Abdominal-B (Abd-B) homolog, is responsible for generating patterns that characterize the anal region and part of the tail. Both of these genes are expressed in a position-specific fashion throughout development and are expressed in cells whose development they control. We also describe an additional homeobox gene in the C. elegans cluster whose sequence and expression suggest that the ancestral HOM cluster may have also regulated head development. Together the results indicate that the conserved HOM-C gene system that generates anteroposterior pattern in insects and vertebrates also patterns the anteroposterior body axis of C. elegans. These findings indicate that nematodes as well as arthropods and vertebrates have continued to generate anteroposterior pattern in a fundamentally conserved manner since the HOM-C arose within a common ancestor. Our findings also raise the possibility that


Figure 1. Expression of the egl-5-lacZ Fusion
(A) A hermaphrodite L1 larva (0-2.5 hr) showing expression of the extrachromosomal egl-5-lacZ fusion extending from the most posterior body region into the tail. Similar localized expression was observed in over 95% of animals (n > 50), although the number of cells staining was variable. The arrow points to the HSN, which migrates from the tail to the central body region during embryogenesis. The egl-5-lacZ fusion is strongly and consistently (in over 95% of cases; n > 25) expressed in the posterior body region and the tail of adults, especially in males.
(B) Individual cells expressing egl-5-lacZ in an L1 animal (0-2.5 hr). mu, body wall muscle (S. J. Salser and C. K., unpublished data). Cells were initially identified using a compound Nomarski microscope. The entire expression pattern of this animal was recorded and reconstructed from photographs by S. J. Salser of a series of focal planes, one of which is shown here. We identified some or all of the staining cells in approximately 20 animals. Fewer cells were observed to stain in most of these animals than in the animal shown here. Cells expressing lacZ usually (in over 90% of cases) included subsets of B, U, Y, K, F, and body muscles in the tail and posterior body region.
(C) Schematic diagram of cells affected by egl-5 mutations in male and hermaphrodite L1 larvae (Chisholm, 1991).

the primordial cluster may have regulated pattern formation in the head as well as the body.

In C. elegans, as in other organisms, HOM-C gene domains overlap to some extent. In C. elegans, it is possible to ask whether two HOM-C genes can cooperate to specify individual cell fates in these regions. We find that in a developmental decision involving cell fusion, two HOM-C genes with overlapping domains act together to specify individual cell fates. The same two genes can compensate for one another's function in regulating the migratory behavior of a neuroblast. Thus, individual cells located where the domains of two HOM-C genes overlap can acquire new identities that depend on the activities of both genes.

Results

A C. elegans Abd-B Homolog Specifies Cell Fates in the Anal Body Region and the Tail

Many specialized structures and cell types are generated by cells located near the anus of the animal. For example, the blast cells B, Y, U, and F, which line the rectum, generate male reproductive structures such as the spicules and the proctodeum (Figure 1C and see Figure 10). A number of specialized neurons such as the interneuron PVC and the egg-laying neuron HSN are generated in this region. One gene, egl-5, seemed like a good candidate for a HOM-C gene specific for this body region. In egl-5 mutants, many of these structures and cell types are missing, in some cases because precursor cells undergo homeotic transformations to more anterior fates (Chisholm, 1991). Furthermore, egl-5 maps just to the right of mab-5 (Chisholm, 1991), where the homeobox ceh-11 is located (Hawkins and McGhee, 1990; Schaller et al., 1990) (ceh stands for C. elegans homeobox).

To learn whether the gene containing ceh-11 might be egl-5, we first asked whether a cosmid containing ceh-11 would rescue the egg-laying defect of egl-5 (see Experimental Procedures). We found that such a cosmid did rescue the phenotype. Next we asked whether the ceh-11 homeobox, which is expected to be required for ceh-11 function, was altered in any of 11 egl-5 mutants. We found mutations in three egl-5 alleles (Figure 2). egl-5(u202) contained a 7 bp insertion that created a stop codon at the beginning of helix 2 of the homeodomain. This mutation should eliminate DNA binding, because the resulting protein lacks homeodomain helix 3, the highly conserved DNA recognition helix that mediates DNA binding (reviewed by Gehring et al., 1990). Two other alleles, n486 and e2495, contained the same C to T transition, replacing Arg-52 with a Cys residue. Arg-52 is a highly conserved residue within helix 3 of the homeodomain. Because three egl-5 alleles contained base changes within conserved regions of the ceh-11 homeobox, which, in turn, is located on a DNA fragment that rescues egl-5 mutants, we conclude that the ceh-11 homeobox gene is indeed egl-5.
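The Arg-to-Cys change caused by a C to T transition constrains which codon could have been mutated. The following sketch enumerates the possibilities under two stated assumptions: the standard genetic code applies, and the transition is read on the coding strand. The helper names `c_to_t_outcomes` and `arg_to_cys_routes` are hypothetical and are not taken from the paper.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def c_to_t_outcomes(codon):
    """Yield (position, mutant codon, amino acid) for each single C-to-T change."""
    for i, base in enumerate(codon):
        if base == "C":
            mutant = codon[:i] + "T" + codon[i + 1:]
            yield i + 1, mutant, CODON_TABLE[mutant]


def arg_to_cys_routes():
    """List every Arg codon that a single C-to-T transition converts to Cys."""
    routes = []
    for codon, aa in CODON_TABLE.items():
        if aa != "R":
            continue
        for position, mutant, new_aa in c_to_t_outcomes(codon):
            if new_aa == "C":
                routes.append((codon, position, mutant))
    return sorted(routes)


if __name__ == "__main__":
    for codon, position, mutant in arg_to_cys_routes():
        print(f"{codon} -> {mutant} (C to T at codon position {position})")
```

Under these assumptions only CGT and CGC qualify, both through a change at the first codon position; the same transition in CGA or CGG would give a stop codon or Trp instead.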
In addition, these results indicate that the strong phenotype previously reported for egl-5(u202) (Chisholm, 1991) is likely to be the null phenotype.

The egl-5 homeobox exhibits levels of homology comparable to those of several Antp class homeoboxes in Drosophila. Three of these, Antp (57% identity), Ultrabithorax (Ubx; 55%), and abd-A (53%), are thought to have diverged from one another following the separation of insects and vertebrates (Akam et al., 1988). The egl-5 homeobox also resembles Abd-B (53%), which is thought to have diverged from the Antp/Ubx/abd-A proteins prior to the divergence of insects and vertebrates (Akam et al., 1988). To understand better the relationship between egl-5 and these other HOM-C genes, we determined the sequence of the entire egl-5 coding region (Figure 2). The inferred protein sequence consists of 223 amino acids. We found additional regions of homology to Ubx, abd-A, and Abd-B downstream of the homeobox (Figure 3). Just upstream of the homeodomains of all HOM-C proteins, with the notable exception of the Abd-B homologs, is a highly conserved hexapeptide. This motif was not present in egl-5. Because the absence of this hexapeptide seems to be a specific feature of the Abd-B homologs, it may be most accurate to classify egl-5 as a member of this group.
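Percent identity values such as those quoted above come from counting matching positions in an alignment. A minimal sketch of that calculation follows; the function name `percent_identity` and the two 10-residue strings are hypothetical illustrations, not the alignments used in the paper, and the paper does not state how gapped positions were scored.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned columns with identical residues.

    Both sequences must already be aligned to the same length.
    Columns where both sequences carry a gap are ignored; a gap
    opposite a residue counts as a mismatch (an assumption made here).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b and a != gap:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Illustrative toy strings only, not real homeodomain alignments.
    reference = "TERQIKIWFQ"
    query = "TDRQVKIWFQ"
    print(percent_identity(reference, query))
```

For the toy pair, 8 of 10 columns match, so the function returns 80.0.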

[Figure 2: egl-5 nucleotide and predicted amino acid sequence listing]

Figure 2. Nucleotide and Predicted Amino Acid Sequences of egl-5
Positions of introns are indicated by arrowheads. The box indicates the most likely translation start site, as inferred from other C. elegans start sites (M. Perry, G. Hertz, and W. Wood, personal communication). The homeodomain is underlined. Open circles indicate mutations found in egl-5 mutants. After position 512, a 7 bp insertion (AAAACAA) was found in egl-5(u202). This insertion results in a stop codon immediately downstream. At position 576, a C to T transition was found in alleles n486 and e2495. This change results in an Arg to Cys substitution. In the egl-5-lacZ fusion, the lacZ gene was inserted at the ApaI site at position 344. We are not certain that the 5' end of the cDNA is full length. It is likely, however, that this sequence includes the beginning of the protein, because there is an in-frame stop codon 21 bp upstream of the first AUG, and there is no putative splice acceptor in between. Three asterisks denote the stop codon TAA.
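A 7 bp insertion shifts the reading frame by one base (7 mod 3 = 1), which is why a stop codon can appear shortly downstream. The sketch below shows the effect on a hypothetical toy open reading frame; only the inserted sequence AAAACAA comes from the caption. The toy sequence, the insertion site, and the helper names `TOY_ORF`, `insert_after`, and `sense_codons_before_stop` are assumptions made for illustration, since the actual egl-5 sequence is not reproduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical 10-codon open reading frame followed by a TAA stop.
TOY_ORF = "ATGAAACGTGCTGATCGTCAAATCTGGTTCTAA"
INSERTION = "AAAACAA"  # the 7 bp insertion reported for egl-5(u202)


def insert_after(sequence, position, insert):
    """Insert `insert` after the 1-based nucleotide `position`."""
    return sequence[:position] + insert + sequence[position:]


def sense_codons_before_stop(sequence):
    """Count in-frame codons read from the start until the first stop codon."""
    count = 0
    for i in range(0, len(sequence) - 2, 3):
        if sequence[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return count  # no in-frame stop found


if __name__ == "__main__":
    mutant = insert_after(TOY_ORF, 9, INSERTION)
    print(len(INSERTION) % 3)                 # frame shift in bases
    print(sense_codons_before_stop(TOY_ORF))  # wild-type toy ORF
    print(sense_codons_before_stop(mutant))   # after the insertion
```

In this toy case the wild-type frame reads 10 codons before its stop, whereas the shifted frame reaches a TGA after 6 codons, so everything downstream of the insertion site is lost.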

[Figure 3: alignment of the hexapeptide regions, homeodomains, and adjacent residues]

Figure 3. Similarities between Nematode, Fly, and Mouse HOM-C Genes
The homeodomains, adjacent residues, and hexapeptide regions of the C. elegans HOM-C genes are compared with those Drosophila HOM-C genes and mouse paralogs that exhibit the highest overall amino acid identity within these regions. In the case of egl-5/ceh-11, Hoxd-11 exhibited the highest level of identity within the homeodomain itself; when adjacent regions were considered, several mouse Abd-B-class paralogs had similar levels of identity. Dashes indicate identity with Antp. Identical residues between the nematode and fly or mouse genes are boxed. Highlighted residues Gln-6 and Thr-7 in Antp, Ubx, and abd-A homologs appear to be common to only these classes. The precise distance between the hexapeptide regions and the homeodomains is not conserved. egl-5 does contain the residues YP upstream of the homeodomain. It is possible that this is a divergent hexapeptide sequence. Recently, ceh-13 was found to contain a hexapeptide region that is most similar to the lab homologs (C. Wittmann and H. Tobler, personal communication). The sequences (renamed according to Scott, 1992) have been tabulated from the following references: ceh-13, Schaller et al., 1990; mab-5, Costa et al., 1988; lab, Diederich et al., 1989; pb, Cribbs et al., 1992; Dfd, Regulski et al., 1987; Scr, LeMotte et al., 1989; Antp, Laughon et al., 1986; Ubx, Kornfeld et al., 1989; abd-A, Karch et al., 1990; Abd-B, Celniker et al., 1989; Hoxb-1, Frohman et al., 1990; Hoxa-4, Galliot et al., 1989; Hoxa-2, Tan et al., 1992; Hoxa-5, Fibi et al., 1988; Hoxb-6, Schughart et al., 1988; Hoxb-7, Meijlink et al., 1987; Hoxb-8, Kongsuwan et al., 1989; and Hoxd-11, Izpisúa-Belmonte et al., 1991.


Figure 4. Embryonic Expression Pattern of the lin-39-lacZ, mab-5-lacZ, and egl-5-lacZ Fusion Genes
The schematic drawing shows a mid-stage embryo (~460 min after first cleavage) with the anterior (A) and posterior (P) ends indicated. Below are shown embryos that carry the extrachromosomal lin-39-lacZ (lin-39) or egl-5-lacZ (egl-5) constructs. Expression of an integrated mab-5-lacZ (mab-5) fusion, which has been described previously (Salser and Kenyon, 1992; Cowing and Kenyon, 1992), is shown for comparison. The majority of embryos (85%; n = 34, lin-39; n = 20, egl-5) exhibited the expression patterns shown or stained with slightly lower intensity. The remaining 15% of embryos showed the above localized staining with slightly extended domains.


egl-5 Is Expressed in a Position-Specific Manner and Is Present in Cells That Require Its Function

A characteristic feature of HOM-C genes in other organisms is their expression in restricted spatial domains. To learn whether the egl-5 gene was likely to be expressed in a position-specific fashion, we constructed a ceh-11-lacZ fusion gene (see Experimental Procedures). In two independent transgenic lines, embryos (Figure 4), larvae (see Figure 1), and adults (data not shown) expressed β-galactosidase in the posterior body region near the anus and in the tail.

We examined larvae at stage L1 to investigate whether the gene was expressed in cells known to require egl-5 function to differentiate correctly during that time (see Figure 1C). The number of cells expressing the fusion varied from animal to animal, possibly because the fusion was extrachromosomal and lost during mitosis. However, we often saw expression in the rectal blast cells B, Y, U, F, and K. One cell in the central body region also expressed lacZ; this was the HSN, which differentiates in the tail region in an egl-5-dependent fashion and then migrates anteriorly late in embryogenesis.

[Figure 5: lin-39 nucleotide and predicted amino acid sequence listing]

Figure 5. Nucleotide and Predicted Amino Acid Sequences of lin-39
Potential translation start sites are boxed. The positions of introns are indicated by closed arrowheads. The homeodomain is underlined. The open arrowhead indicates the position where a nonsense mutation (TGA) was made in a 10 kb DNA fragment that rescues lin-39 (subclone ANco4). The 1 kb wild-type genomic fragment that was coinjected with ANco4 includes the coding sequence from position 406 to position 583. The open circle at position 740 indicates the mutation found in the lin-39(mu26) allele. This mutation is a G to A transition that changes the Trp residue to a stop codon. In the lin-39-lacZ fusion, the lacZ gene was inserted at the NcoI site at position 553. This cDNA is likely to be full length or nearly full length, because a message of similar size (approximately 1.3 kb) was observed in all developmental stages by Northern analysis (data not shown), and because the smallest genomic fragment that rescues lin-39(mu26) extends only about 360 bp upstream of the cDNA.


Figure 6. Function and Expression of lin-39 in the Central Body Region
(A) Summary of P cell lineages and Q cell migrations in wild type (Sulston and Horvitz, 1977) and lin-39 mutants. Diamonds and arrowheads denote cell fates dependent on HOM-C genes. The closed diamonds denote lin-39-dependent cell fates. The open arrowheads point to cell deaths (x) that require mab-5 activity (Kenyon, 1986). The closed arrowhead points to an epidermal cell division dependent on egl-5 activity (Chisholm, 1991). Lineages of overlapping subsets of P cells were followed in several lin-39(mu26) L1 hermaphrodites. The ventral hypodermis was disorganized in the vicinity of P(3-6). Our lineage data were consistent with previous analysis of the weaker allele n709 (Ellis, 1985) except that in lin-39(mu26) animals, the Pn.p cells did not divide. The Q migrations on the left are indicated by a solid arrow, the Q migrations on the right by a dashed arrow. V, vulval precursor cell; N, VC neuron.
(B and C) Expression of an integrated lin-39-lacZ fusion in an L1 larva. The two panels show the right (B) and left (C) sides of an L1 that is 4 hr old. At 4-6 hr after hatching, expression in L1 animals was seen in the juvenile ventral cord neurons (60%, n = 197; see unmarked cells in ventral cord of animal shown), in Q.a and Q.p (R, 51%, n = 101; L, 40%, n = 90), in P(3/4) (R, 10%, n = 107; L, 7%, n = 90), in P(5/6) (R, 64%, n = 107; L, 73%, n = 90), in P(7/8) (R, 96%, n = 107; L, 89%, n = 90), and in an unidentified cell near the pharynx (39%, n = 197; visible on left side of animal shown). Expression was also seen in Q itself (R, 10%, n = 59; L, 40%, n = 56). Expression was also occasionally seen in V1-V5 (under 5%). Staining was present in descendants of the P and Q cells; however, it is not clear whether this presence reflected continued transcription or perdurance of β-galactosidase activity. Older larvae and adults consistently expressed the fusion in the central body region (in over 95%, n > 50). In adult hermaphrodites, the fusion was often expressed in the vulval muscles (data not shown).

The C. elegans Dfd/Scr/pb Homolog Is the Gene lin-39

In a search for Antp class homeoboxes, we previously identified a HOM-C gene that seemed likely to be specific for the central body region (Kamb et al., 1989). The homeobox of this gene, ceh-15, most closely resembles Drosophila Scr (77%), Dfd (77%), and pb (73%). Because these three genes specify pattern just anterior to the domain of Antp in the fly, it seemed possible that ceh-15 might specify pattern just anterior to the domain of the Antp homolog, mab-5, in the worm. We hypothesized that mutations in the gene containing ceh-15 would affect the development of the central body region. In addition, the gene should map in the vicinity of the HOM cluster.

Mutations in one gene, lin-39, met these criteria. lin-39 was first identified genetically by Ellis and Horvitz (Horvitz et al., 1982; Ellis, 1985). They found that in lin-39(n709), cells located in the central body region that normally became VC neurons, which are putative egg-laying neurons, instead underwent programmed cell death. In addition, vulval precursor cells often failed to divide (Ellis, 1985). In stronger alleles isolated by S. G. Clark and H. R. Horvitz (personal communication) and by us, these cells consistently fail to divide (data not shown). In addition, lin-39 is located on chromosome III near mab-5 (Wood, 1988; S. G. Clark and H. R. Horvitz, personal communication), making it the best known candidate for a HOM-C gene specific for the central body region.

To learn whether lin-39 might correspond to ceh-15, we first asked whether a cosmid containing ceh-15 would rescue the vulva-less (Vul) phenotype of our lin-39(mu26) allele. We found that such a cosmid did rescue the phenotype, as did a 10 kb subclone containing the homeobox (see Experimental Procedures). To determine whether rescuing activity required ceh-15, we introduced a nonsense mutation in the predicted hexapeptide motif just upstream of the homeodomain (Figure 5) and found that rescuing activity was lost. Rescuing activity could be restored by coinjecting a 1 kb wild-type DNA fragment that overlapped the mutation but contained only part of the coding sequence, presumably as a result of homologous recombination in vivo. This control argued that it was the nonsense mutation in the ceh-15 coding sequence that destroyed rescuing activity and not some other mutation introduced inadvertently. In addition, we sequenced the ceh-15 homeobox of lin-39(mu26) and found that it contained a G to A transition converting an absolutely conserved Trp residue within the recognition helix (Scott et al., 1989) to a nonsense codon (Figure 5). This mutation should eliminate DNA binding activity. Together these observations led us to conclude that the gene containing the ceh-15 homeobox is lin-39.
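The lin-39(mu26) lesion, a G to A transition that turns a Trp codon into a stop codon, follows directly from the genetic code, because Trp is encoded only by TGG. The sketch below enumerates the single-base outcomes, assuming the standard genetic code and a change read on the coding strand; `transition_outcomes` is a hypothetical helper name, and the text does not say which of the two G positions was altered in mu26.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def transition_outcomes(codon, old_base, new_base):
    """Return (position, mutant codon, amino acid) for each single substitution.

    Positions are 1-based within the codon; '*' marks a stop codon.
    """
    outcomes = []
    for i, base in enumerate(codon):
        if base == old_base:
            mutant = codon[:i] + new_base + codon[i + 1:]
            outcomes.append((i + 1, mutant, CODON_TABLE[mutant]))
    return outcomes


if __name__ == "__main__":
    trp_codons = [c for c, aa in CODON_TABLE.items() if aa == "W"]
    print(trp_codons)
    for position, mutant, aa in transition_outcomes("TGG", "G", "A"):
        print(position, mutant, aa)
```

Both possible G to A changes in TGG (giving TAG or TGA) produce a stop codon, so any such transition within a Trp codon is a nonsense mutation.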

Figure 7. Fusion Pattern of Pn.p Cells
(A) Summary of the Pn.p fusion pattern in males and hermaphrodites. The domains where lin-39 and mab-5 influence the fusion decision are indicated by bold lines. In hermaphrodites, mab-5 does not influence the fusion decision (Kenyon, 1986). Unfused Pn.p cells are represented by open circles and fused Pn.p cells by dashed lines.
(B) In situ staining of late L1/early L2 larvae with the MH27 antibody. Only P(3-11).p cells are labeled; P12.p adopts a different fate. The anus, which is always in the same focal plane as the Pn.p cells, is indicated by an arrowhead. A lin-39(n1760); him-5(e1490); dpy-21(e428) strain was used to distinguish males (which are normal length) from hermaphrodites (which are dumpy). In lin-39(n1760); him-5(e1490), the pattern seen in both sexes was highly consistent (n > 20). In the lin-39(n1760) mab-5(e1239); him-5(e1490) triple mutant, the frequency of males was as expected for him-5(e1490). In 16 of 18 cases, no unfused Pn.p cells were seen. In 2 of 18 cases, one remaining Pn.p cell appeared by its small size to be in the process of fusing. The fusion pattern in wild type and mab-5 mutants was described previously (Kenyon, 1986).

As mentioned above, the lin-39 homeobox most closely resembles those of Dfd, Scr, and pb. To learn whether lin-39 might have additional similarities to any proteins outside the homeodomain, we determined the lin-39 coding sequence from cDNA and genomic clones (see Experimental Procedures). The inferred Lin-39 protein sequence contained 253 amino acids (Figure 5). The protein exhibited additional sequence similarity to the same proteins, Dfd, Scr, and pb, just downstream of the homeodomain and within the conserved hexapeptide sequence located just upstream (see Figure 3).

The lin-39 Gene Gives the Central Body Region Its Identity

To learn whether lin-39 functions specifically to generate cell specialization within the central body region, we characterized the mutant phenotype more extensively. In the wild type, the vulva and the VC neurons mentioned above are produced by centrally located members of a row of ventral epidermal precursor cells, P(1-12). Soon after hatching, each P cell undergoes a stereotypical lineage to generate one epidermal cell (the Pn.p cell) and a neuroblast (the Pn.a cell). The P(3-8).p cells become members of the vulval equivalence group and can be induced by the gonadal anchor cell to generate vulval cell types. The P(3-8).aap cells become the VC neurons. In other lineages, P(1,2,9-11), the Pn.p cells fuse with the epidermal syncytium soon after their birth, and the Pn.aap cells undergo programmed cell death (Figure 6A).

If the role of lin-39 is to allow centrally located cells to adopt specialized fates, then in lin-39 mutants, central cells should adopt the fates of homologous cells located elsewhere. This is the case for the VC neurons, as described above. In addition, the P(3-8).p cells would be expected to fuse with the epidermal syncytium, as do their homologs in other body regions. To investigate whether this was the case, we monitored cell fusion with an antibody that recognizes a component of cell junctions. As shown in Figure 7, all of the Pn.p cells underwent fusion in lin-39 hermaphrodites. Thus, the role of lin-39 in the P(3-8).p cells is to allow them to develop differently from their homologs.

In the male, the centrally located P(3-8).aap cells each divide to generate two neurons, one of which (the Pn.aapp


Figure 8. Q Lineage and Migrations in Wild-Type and Mutant Animals
(A) Q lineage showing time of cell divisions in wild-type animals at 20°C. The timing of cell divisions was sometimes abnormal in lin-39 and in lin-39 mab-5 mutants. Closed circle, position where Q divided; large open triangle, position where Q.a divided; small open triangle, final position of Q.ap; large stippled square, position where Q.p divided; split line, position where Q.pa divided; small stippled box, final positions of Q.paa and Q.pap; x, cell death.
(B and C) Q migrations in wild-type and mutant animals. The epidermal cells V1-V6 are labeled as references for position along the body axis. QR, Q neuroblast on the right; QL, Q neuroblast on the left. mab-5 affects primarily the migrations of QL descendants, whereas lin-39 affects primarily the migrations of QR and its descendants. The double mutant affects migrations on both the right and left sides. Migrations in wild-type and mab-5 animals are shown for comparison (Salser and Kenyon, 1992).
(B) Migration of QR and its descendants. The third diagram shows the actual pattern of cell migrations in 1 of 5 lin-39(mu26) animals observed. The final positions of the Q descendants in additional animals were determined after the migrations had been completed. The position of QR.ap was variable (as shown by the dashed line); sometimes it remained posterior to its wild-type position (5 out of 14). QR.paa and QR.pap were usually located (14 out of 15) in the region spanned by the epidermal cells V3 and V4. The final positions of Q.ap, Q.paa, and Q.pap show the same distribution in lin-39(mu26) and lin-39(n1760) animals (n = 15, mu26; n = 10, n1760). The lower diagram shows cell migrations in 1 of 2 lin-39(n1760) mab-5(e1239); him-5(e1490) animals observed. In 1 animal, Q.ap migrated anteriorly and posteriorly several times between V3 and V4 before stopping near V3. The position of Q.ap was variable: sometimes Q.ap was located posterior to its wild-type position (11 out of 64). Q.paa and Q.pap were usually located in the region (54 out of 64) spanned by the epidermal cells V3 and V4.
(C) The third diagram shows cell migrations in 1 of 4 lin-39(mu26) animals observed. In 12 additional lin-39(n1760) animals, the migration of QL was followed until its first division. In the majority of n1760 animals, QL migrated as in wild type (8 out of 12). In 2 animals, QL migrated to V5 and then returned to its birth position before dividing. In 2 other animals, QL migrated posteriorly past V5 and then turned to migrate anteriorly again, dividing just dorsal to V5. (Occasionally [1 out of 15], a similar QL migratory behavior was seen in mab-5 mutants. In addition, in 4 out of 15 mab-5 animals, QL migrated past V5 before dividing and then continued posteriorly; data from S. J. Salser.) The fourth diagram shows cell migrations in 1 of 3 lin-39(n1760) mab-5(e1239); him-5(e1490) animals observed. Thirteen additional migrations of just QL were also observed. In 15 out of 16 animals, QL migrated posteriorly past V5. In the majority of cases (14 out of 16), QL then migrated anteriorly back over V5. Twice, QL migrated back and forth several times before dividing. The division of QL usually occurred over V5 (12 out of 16), although a few occurred either posterior (1 out of 16) or anterior (3 out of 16) to V5. The position of Q.ap was variable: occasionally Q.ap was found posterior to its wild-type position (9 out of 60). Q.paa and Q.pap were usually located in the region spanned by V3 and V4 (55 out of 60).

A. Q lineage

cell) is a serotonergic neuron that regulates mating behavior (Loer and Kenyon, 1993). In lin-39 males, these serotonergic neurons were not present, as assayed by immunolabeling with an anti-serotonin antibody (Loer and Kenyon, 1993). A more detailed analysis of the lin-39 male will be published elsewhere (S. J. Salser and C. K., unpublished data; Clark et al., 1993 [this issue of Cell]).

lin-39 affects cell migration as well as the fates of stationary cells. We isolated the lin-39(mu26) allele in an ongoing screen for Q neuroblast migration mutants; misplacement of these cells in lin-39 mutants had also been observed by C. Bargmann, S. G. Clark, and H. R. Horvitz (personal communication). The Q neuroblasts are bilateral homologs born just posterior to the central body region. These two cells migrate in opposite directions. The Q neuroblast on the right (QR) and its descendants migrate anteriorly through the central body region, whereas the Q neuroblast on the left (QL) and its descendants normally migrate posteriorly. In lin-39 animals, the migrations of QR and its descendants were foreshortened (Figure 8). The migrations of QL and its descendants were usually unaffected, although QL itself occasionally migrated abnormally (Figure 8).

A conserved feature of HOM-C genes is their positional specificity. To learn whether or not the function of lin-39 was position specific, we examined cells and structures


located elsewhere in lin-39(mu26) animals. Male mating structures, which develop in the posterior body region and tail, including the rays, hook, diagonal muscles, and spicules, appeared wild type (n = 20). Other posterior structures, the postdeirid (n = 11) and M-derived coelomocyte (n = 11), were also normal. The excretory pore (n = 10) and BDU (n = 11), located in the anterior, appeared normal. Serotonergic neurons in the head and tail were not affected (Loer and Kenyon, 1993). Thus, in lin-39 mutants, structures normally generated in the central body region during postembryonic development are missing, while structures that characterize other body regions appear to be normal. Together these findings argue that the wild-type role of lin-39 is to give the central body region its identity.

lin-39 Is Expressed in the Central Body Region in Cells Known to Require Its Function

To inquire whether lin-39 was expressed in a position-specific fashion, we constructed a lin-39-lacZ fusion containing a greater amount of upstream DNA than was required for rescuing activity, and we obtained two independent transgenic lines (see Experimental Procedures). In embryos, larvae, and adults, the fusion was expressed in cells located in the central body region (see Figures 4 and 6; data not shown). Because the development of the central P(3-8) cells, as well as the Q cell migrations, are altered in lin-39 mutants, we asked whether these cells expressed the lin-39-lacZ fusion during the L1, when they begin to express position-specific fates. We found that they did (see Figure 6). The fusion was expressed consistently in P(5-8) and to a lesser extent in P(3,4). We did not observe expression in P cells located in other body regions. The fusion also began to be expressed in the migratory Q neuroblasts just before the time of their migration. These data suggest that lin-39 acts close to the time that centrally located cells begin to express alternative fates.
They also suggest that lin-39 functions cell autonomously, as do mab-5 (Kenyon, 1986) and egl-5 (Chisholm, 1991).

Two C. elegans HOM-C Genes with Overlapping Spatial Domains Specify Cell Fates Combinatorially

In both insects and vertebrates, certain HOM-C genes are expressed in overlapping domains (see McGinnis and Krumlauf, 1992). In insects, it is clear that within these domains of overlap, a single cell can express more than one HOM-C gene. Can HOM-C genes act together to specify individual cell fates? We have found that in the C. elegans male, the fates of two Pn.p cells located where the domains of two HOM-C genes overlap are determined by the combined activities of both HOM-C genes. In the wild-type male, the pattern of Pn.p cell fusion is complex. The most anterior cells, P(1,2).p, and also two more posteriorly located cells, P(7,8).p, undergo cell fusion soon after they are born. In contrast, the intervening Pn.p cells, P(3-6).p and P(9-11).p, remain mononucleate. Our previous work led us to suggest that this pattern might be generated by mab-5 acting in combination with another

homeotic gene that functioned in the central body region but had an overlapping spatial domain (Kenyon, 1986). Because lin-39 functions within this spatial domain, we asked whether lin-39 functions combinatorially with mab-5 in these cells. As shown in Figure 7, we found that acting alone, lin-39(+) and mab-5(+) each prevent cell fusion within their domains of function. In a lin-39(-) mab-5(-) double mutant, P(1-11).p all fused. In a lin-39(-) mutant, where only mab-5(+) is present, the posterior Pn.p cells, P(7-11).p, were prevented from fusing. Likewise, in a mab-5(-) mutant, where only lin-39(+) is active, the centrally located Pn.p cells, P(3-8).p, were prevented from fusing. When both genes were active (that is, in the wild type), one might have expected a simple additive phenotype, in which all the cells in the combined lin-39 and mab-5 domains (P(3-11).p) would remain unfused. In fact, the majority of these cells did remain unfused. However, the two Pn.p cells within the region of overlap of mab-5 and lin-39 activity, P(7,8).p, adopted the alternative fate instead: they now fused. These results indicate that lin-39 and mab-5 together promote a fate (cell fusion) that is different from the fate promoted by either gene acting alone (no cell fusion). Cell fusion is also observed in the anterior P1.p and P2.p cells, in which neither mab-5 nor lin-39 appears to function. Thus, by an unknown mechanism, the combined activity of lin-39 and mab-5 leads to the same cell fate that is specified if both are inactive.

Two HOM-C Genes with Overlapping Spatial Domains Compensate for One Another's Function in Regulating Cell Migration

The bilateral homologs QL and QR, which migrate in opposite directions, are born where the domains of mab-5 and lin-39 intersect. mab-5 and lin-39 both affect migration of cells in the Q lineage.
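The male Pn.p fusion pattern reported in the preceding section can be summarized as a simple truth table: a cell stays unfused when exactly one of the two genes acts on it, and fuses when neither or both do. The Python sketch below is only an illustrative restatement of the observed phenotypes, not a mechanistic model (the mechanism is described in the text as unknown). The domain boundaries P(3-8).p and P(7-11).p are taken from the single-mutant results above; the names `LIN39_DOMAIN`, `MAB5_DOMAIN`, `fuses`, and `fused_cells` are hypothetical and introduced here for illustration.

```python
# Illustrative truth table for Pn.p cell fusion in the male.
# Assumption: each gene acts only within its functional domain,
# as inferred from the single-mutant phenotypes in the text.
LIN39_DOMAIN = set(range(3, 9))    # P(3-8).p
MAB5_DOMAIN = set(range(7, 12))    # P(7-11).p


def fuses(n, lin39_active=True, mab5_active=True):
    """Return True if Pn.p (n = 1..11) fuses with the syncytium."""
    lin39_acts = lin39_active and n in LIN39_DOMAIN
    mab5_acts = mab5_active and n in MAB5_DOMAIN
    # Exactly one gene acting -> unfused; neither or both -> fused.
    return lin39_acts == mab5_acts


def fused_cells(lin39_active=True, mab5_active=True):
    """List the Pn.p cells (by number) that fuse in a given genotype."""
    return [n for n in range(1, 12) if fuses(n, lin39_active, mab5_active)]


if __name__ == "__main__":
    genotypes = {
        "wild type": (True, True),
        "lin-39(-)": (False, True),
        "mab-5(-)": (True, False),
        "lin-39(-) mab-5(-)": (False, False),
    }
    for name, (lin39, mab5) in genotypes.items():
        print(name, fused_cells(lin39, mab5))
```

Under these assumptions the wild-type call returns cells 1, 2, 7, and 8, matching the fusion of P(1,2).p and P(7,8).p described above, and the double mutant returns all eleven cells.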
mab-5 is required for the descendants of QL to migrate posteriorly instead of migrating anteriorly through the central body region (Chalfie et al., 1983; Salser and Kenyon, 1992). As described above, lin-39 is required for QR and its descendants to migrate anteriorly through the central body region instead of stopping prematurely. The cell QL expresses both lin-39-lacZ and mab-5-lacZ fusions (see above; Salser and Kenyon, 1992). To learn whether both lin-39 and mab-5 activities might affect QL, we followed Q migrations in the lin-39 mab-5 double mutant. In most respects, the phenotypes were additive (Figure 8). For example, the descendants of QL exhibited both the mab-5 defect (they migrated toward the head instead of the tail) and the lin-39 defect (they stopped migrating prematurely in the central body region). However, the migration of QL, which was nearly always normal in either single mutant, was consistently altered. Instead of migrating a short distance posteriorly and then dividing, QL migrated too far posteriorly, turned around, and migrated anteriorly again before dividing. Apparently, either lin-39 or mab-5 activity alone is sufficient to cause QL to stop migrating and to divide at the correct place. This implies that in addition to acting in combination to specify an alternative fate, lin-39 and mab-5 can compensate for one another's activity in a cell located where their domains overlap.

Are There Other HOM-C Genes in the C. elegans Cluster?

Using a probe specific for Antp-class homeoboxes (Bürglin et al., 1989; see Experimental Procedures), we looked for additional HOM-C genes by probing cosmid and yeast artificial chromosome clones that together span the cluster (200-300 kb) and in addition extend approximately 200-300 kb on each side. In addition to the four known HOM-C genes, this probe detected one new homeobox, ceh-23, located approximately 30 kb to the right of the egl-5 homeobox. The sequence of ceh-23 was most similar to the homeoboxes of Drosophila ems (52%) and Dll (50%), which are only distantly related to Antp-class homeoboxes (Figure 9). To learn where this gene might be expressed, we constructed a ceh-23-lacZ fusion (see Experimental Procedures) and obtained two independent transgenic lines. In larvae and adults of each line, the fusion was expressed in a number of cell bodies in the nerve ring, including olfactory neurons, and also in two lateral neurons thought to be involved in osmotic regulation (Figure 9; data not shown). No other homeoboxes were found in this screen. However, additional homeoboxes could have escaped detection, first, if their sequences differed significantly from the probe; second, if they had somehow been deleted from these cosmid or yeast artificial chromosome clones; or third, if they were located elsewhere in the genome. Additional C. elegans HOM-C genes may exist, since several pattern elements cannot be accounted for by the known HOM-C genes. For example, the postdeirid neuroblast, located in the posterior body region, and three pairs of male ray sensilla, located in the tail, are not affected by the known HOM-C mutations (Figure 10; Kenyon, 1988; Chisholm, 1991).

Discussion

The results presented here suggest that the C.
elegans HOM cluster is the major determinant of axial diversification during the postembryonic development of this organism. One of the four C. elegans HOM-C genes, mab-5, was previously known to be expressed in a position-specific fashion in the posterior body region (Costa et al., 1988), where it functions to generate the structures and cell types that characterize that region (Kenyon, 1988). We have shown that a second HOM-C gene is lin-39, which gives the central body region its identity. Moreover, where the domains of function of lin-39 and mab-5 overlap, these genes can act together to specify individual cell fates. We have also found that a third HOM-C gene is egl-5, which specifies cell fates near the tail. Like mab-5, both lin-39 and egl-5 are expressed in distinct spatial domains within cells that require their function. Together these three genes function in the same order along the body axis as do their closest homologs in other organisms, suggesting that this patterning system has been highly conserved.

Figure 10 summarizes the organization of the C. elegans HOM complex and the patterns of structures and cell types that these three HOM-C genes, lin-39, mab-5, and egl-5, generate along the anteroposterior axis. Some of these structures arise from unique precursor cells, such as the blast cell M, which generates sex-specific muscles. Other structures arise within the lineages of serially repeated cells. This is best exemplified by the P cells, which undergo a basic stereotypical lineage that is modified by activities of the HOM-C genes (see Figure 8A). For example, each P cell generates a Pn.aap descendant whose fate depends on which HOM-C gene acts on it. lin-39 acts on Pn.aap cells in the central body region, causing them to become VC neurons in the hermaphrodite (Horvitz et al., 1982; Ellis, 1985) or to generate serotonergic neurons in the male (Loer and Kenyon, 1993). In contrast, mab-5 acts on these cells in the posterior of the male, causing them to divide and generate two other kinds of neurons (Kenyon, 1988). The fates of the epidermal Pn.p cells are also diversified by the HOM-C genes. lin-39 prevents the central Pn.p cells in the hermaphrodite from undergoing cell fusion and allows them to become members of the vulval equivalence group (see Figure 7). mab-5 prevents a more posterior set of Pn.p cells in the male from undergoing cell fusion and allows them to become members of the male equivalence group, which generates mating structures (Kenyon, 1988). egl-5 activity causes the most posterior Pn.p cell to generate a cell that dies and a preanal hypodermal cell (Chisholm, 1991).

The domains of expression of HOM-C genes in many organisms are known to overlap, but the functional significance of this overlap in determining individual cell fates has been unclear. We have found that in male Pn.p cells, mab-5 acts combinatorially with lin-39 to specify the pattern of Pn.p cell fusion.
Both mab-5 and lin-39 act in the nonoverlapping parts of their domains of function to prevent cell fusion, but where their domains overlap, they cause cells to fuse. In yeast, the a1 and α2 homeodomain proteins together regulate the expression of diploid-specific genes, whereas alone α2 regulates a different set of genes, haploid-specific genes (Herskowitz, 1989). Our results with C. elegans show that the combined activities of two HOM-C genes with overlapping functional domains can determine the fates of individual cells in metazoans. In this way, cells located where two HOM-C domains intersect can acquire new identities that distinguish them from cells in either neighboring body region. Possible mechanisms by which lin-39 and mab-5 interact to specify Pn.p cell fates will be described elsewhere (S. J. Salser and C. K., unpublished data).

As well as affecting the fates of stationary cells, C. elegans HOM-C genes may also play a general role in regulating cell migrations. The HOM-C gene mab-5 regulates cell migration within its domain of function in the posterior body region (Chalfie et al., 1983; Kenyon, 1988; Salser and Kenyon, 1992). Likewise, lin-39 is also required for cell migration through its domain of function in the central body region. Because lin-39 is expressed within these migrating cells, it seems likely that it, like mab-5, functions cell autonomously in regulating cell migration. For the most part, lin-


Figure 9. ceh-23 Homeobox Sequence and Expression of a ceh-23-lacZ Fusion
(A) The ceh-23 homeobox is compared with the Drosophila genes ems and Dll and murine homologs Emx1/Emx2 and Dlx-1/Dlx-2. Dashes indicate amino acid identity. It is possible that there are additional C. elegans ems-like genes. Partial sequence of one homeobox, ceh-2, exhibits homology to ems (Bürglin et al., 1989). Sequences have been tabulated from the following sources: ems (Dalton et al., 1989), Emx1/Emx2 (Simeone et al., 1992), Dll (Vachon et al., 1992), Dlx-1 (Price et al., 1991), and Dlx-2 (Porteus et al., 1991).
(B) The extrachromosomal ceh-23-lacZ fusion was expressed in larvae and adults (data not shown), primarily in bilateral pairs of the amphid olfactory AWC neurons, putative chemosensory ADL neurons, and CAN neurons, which run along the excretory canal of the animal and are thought to have an osmoregulatory function (C. Bargmann, personal communication). The Drosophila ems gene is required for development of the Drosophila trachea, which may play a similar role to the excretory system of nematodes. Approximately 24% (n = 156) of L1 animals examined had the staining pattern shown. In about 71% of the L1 animals, only a subset of these neurons stained or staining was less intense. In about 5% of animals expressing the fusion, we observed staining in additional cells along the body.


Figure 10. Organization and Patterning Function of the C. elegans HOM-C
(A) Possible evolutionary relationships, inferred solely from the level of amino acid identity in and around the homeodomain and from the lack of a hexapeptide motif in egl-5, are indicated. The tightly linked ceh-13 and lin-39 genes are separated by an estimated 200-300 kb from the tightly linked mab-5, egl-5, and ceh-23 genes. The ceh-23 (ems/Dll-like) homeobox is included because its position in the C. elegans and human clusters appears to be conserved (see text). As in Drosophila, a number of non-HOM-C genes have also been found to map in the C. elegans cluster, such as ncl-1, which affects nucleolar morphology (A. C., unpublished data). The directions of transcription were inferred from DNA sequence and restriction analysis. The orientation of mab-5 transcription was determined by S. J. Salser. Gene labels in the panel, left to right: ceh-13, lin-39 (ceh-15), mab-5, egl-5 (ceh-11), ceh-23.
(B) Diagram of a newly hatched L1 larva (male and hermaphrodite combined) showing cells that require the function of lin-39 (green), mab-5 (red), and egl-5 (blue) to develop correctly. The HSN is drawn in its birthplace in the tail. These cells give rise to the structures shown in the adult in (C).
(C) Diagram of an adult nematode, again color coded to represent the HOM-C gene(s) required for their development. Structures only rarely affected by a HOM-C gene (10% or less) are not color coded for this gene. Considerable cell migration takes place during development (dashed arrows), so that structures generated by different HOM-C genes become intermingled. The migration of the rays is not indicated; ray precursor cells migrate into the tail from the position of V5-V6. Not all structures and cell types specified by the HOM-C genes are shown. Most of these additional structures, such as additional male sex muscles, position-specific cell deaths, and the male proctodeum, arise within the primary domains of the HOM-C genes whose function they require. One structure not shown, the male gonad, is exceptional. This develops in the central/posterior body region in an egl-5-dependent fashion. Both mab-5 and egl-5 affect the lineages of M and V6; however, it is not clear whether these two genes act in the same cells within these lineages. A few of the pattern elements generated postembryonically, such as the three most posterior rays and the postdeirid (Pd), are not affected by lin-39, mab-5, or egl-5 mutations. Figure 8 shows how some of these position-specific structures and cell types arise from the P lineages in the L1. cc, coelomocyte; PVC, PDA, DVB, and the Q descendants, neurons; CP, serotonergic Pn.aapp neurons; h, hypodermal cell. Compiled from data in this paper; Horvitz et al., 1982; Kenyon, 1986; Chisholm, 1991; and Loer and Kenyon, 1993.


39 and mab-5 act independently to control cell migration. However, in QL, which migrates through the zone of overlap of the lin-39 and mab-5 domains, these two genes can compensate for one another's function. This is a second example of how cells located in the overlapping region of two HOM-C gene domains may come under the control of both genes. A different type of HOM-C gene interaction, cross-inhibition of HOM-C gene activity, will be described elsewhere (S. J. Salser and C. K., unpublished data).

HOM-C genes in C. elegans seem to function primarily during postembryonic development. Putative null mutations in lin-39, mab-5, and egl-5 do not affect early embryonic development, suggesting that their functions are not important for establishing the basic body plan of the animal. The same is true of arthropod and vertebrate embryos, whose HOM-C genes are also activated after their basic body plans are established. It is noteworthy that HOM-C genes are not required for the early developmental events in C. elegans that differ from those of arthropods and vertebrates. Unlike these other organisms, the early cleavages of nematode embryos generate blastomeres that are intrinsically different from one another and then undergo extensive local cell-cell interactions. This particular strategy of early embryogenesis must have arisen within the context of the ancestral HOM-C system without disrupting its ability to generate pattern. It will be interesting to learn how mechanisms for activating HOM-C genes along the body axis have changed during evolution to accommodate themselves to this particular type of early embryogenesis.

What did the HOM cluster in the ancestral precursor of nematodes, arthropods, and vertebrates look like? Knowledge of the Lin-39 and Egl-5 protein sequences, along with the tentative conclusion that the C. elegans HOM cluster contains only four genes, allows us to speculate (Figure 10).
Previous sequence analysis has shown that the gene thought to be leftmost in the cluster, ceh-13, is a labial (lab) homolog (Schaller et al., 1990). The function of this gene is not known, but it would be predicted to act in the anterior body region. The next gene, lin-39, exhibits blocks of homology to three neighboring Drosophila HOM-C genes, pb, Dfd, and Scr. It may be that these three genes are closely related to one another. They may have proliferated within a precursor that gave rise to insects and vertebrates following its separation from nematodes. Alternatively, they could each have been present in the original common ancestor but later been compressed into one gene by recombination and deletion events in a line leading to nematodes. The next gene, mab-5, most closely resembles Antp, one of three genes (Antp, Ubx, and abd-A) thought to have diverged from one another fairly recently in arthropod evolution. Finally, as discussed above, the rightmost gene, egl-5, may be best classified as an Abd-B homolog. Together the data suggest that the ancestral cluster contained at least four HOM-C genes: one precursor of lab; one, although possibly more, precursor(s) of pb, Dfd, and Scr; one precursor of Antp, Ubx, and abd-A; and one precursor of Abd-B.

The identification of an ems/Dll homolog near the C. elegans HOM cluster raises the possibility that the ancestral HOM-C also regulated head development. The Drosophila ems and Dll homologs are not located in HOM clusters. However, one vertebrate ems homolog, EMX1, is located in an analogous position just 5' of the HOX4 locus (E. Boncinelli, personal communication), raising the possibility that this gene was present in the ancestral HOM cluster. The Drosophila and vertebrate ems homologs are expressed in olfactory cells (Dalton et al., 1989; Simeone et al., 1992), and Drosophila ems is known to be required for olfactory development (Dalton et al., 1989). Although the function of the C. elegans homolog is unknown, its inferred expression pattern in olfactory neurons is provocative. It raises the possibility that an ems precursor generated olfactory neurons in a common ancestor of nematodes, arthropods, and vertebrates and that this function has been retained throughout evolution.

Experimental Procedures

General Procedures, Nomenclature, and Strains

Methods for routine culturing and genetic analysis are described by Brenner (1974) and Wood et al. (1986). All analyses were performed at 20°C. In most cases, males were generated using him-5(e1490), which increases the frequency of male self-progeny.

Isolation and Sequencing of egl-5 and lin-39 cDNAs

To identify egl-5 cDNA clones, the egl-5 homeobox was amplified by polymerase chain reaction (PCR) from a cosmid subclone and was used to probe 800,000 clones of a cDNA library constructed using mRNA from him-8(e1489) embryos (gift of L. Miller and B. Meyer). We isolated two independent cDNA clones (about 1.0 kb) that had different length poly(A) tails. Both strands of one cDNA and one strand of corresponding genomic DNA were sequenced using Sequenase 2.0 (US Biochemical) and an Applied Biosystems automated sequencer at the Biomolecular Resource Center, University of California, San Francisco.

To identify lin-39 cDNA clones, a 3 kb EcoRI cosmid subclone that contains part of the lin-39 homeobox was used to probe 1 × 10⁶ clones of a mixed-stage cDNA library (gift of B. Barstead and R. Waterston) and 800,000 clones of the him-8(e1489) embryonic cDNA library. Two independent cDNA clones from the first library and a number of cDNA clones from the second library were isolated. Both strands of the longest cDNA (1.3 kb) and one strand of corresponding genomic DNA were sequenced as described above for egl-5.

The homeobox oligonucleotide H&l (Bürglin et al., 1989) was used to probe a set of overlapping cosmids (X104, C45B1, C07B9, C17A3, C47B12, FOBFE, T20B12, C30C5, C03D10, C29A11, F44F12, C34A5, K07D8, T04A8, ZK420, T07B11, C55A11, ZK694, C44F11, C03B8, ZC31, C33C3, ZC97, ZC102, ZK886, C06C3, C27D11, M22, C35F6, C39F10, C44C9, ZK5811, and C04H11) and total yeast DNA-containing yeast artificial chromosomes (Y50B11, Y69F12, Y16G1, YE1 HE, and Y39H8) expected to span the cluster region. A number of hybridizing bands were sequenced. However, other than ceh-23 (located on cosmid C27D11), no homeoboxes were discovered.

To identify ceh-23 cDNA clones, part of the ceh-23 homeobox was amplified by PCR from genomic DNA and was used to probe 2 x 1Q clones from a mixed-stage cDNA library. A single cDNA clone was isolated.
Both strands of the cDNA and one strand of corresponding genomic DNA were sequenced as described above for egl-5.

Sequence comparisons were done by visual inspection and by using the FASTA program (Pearson and Lipman, 1988) on the NBRF/PIR protein data base (version 33). Figure 10 was color coded with Swiss-made Caran D'Ache Aquarell color pencils.

Construction and Analysis of the egl-5-lacZ and lin-39-lacZ Fusions

We attempted to make egl-5-lacZ and lin-39-lacZ fusions that would accurately reflect the wild-type expression pattern by including large regions upstream and downstream of each gene, as well as all introns. Other experiments have shown that for mab-5, a 7 kb upstream fragment is sufficient to reproduce the normal expression pattern (visualized with an antibody) in most cells, suggesting the HOM-C regulatory regions in C. elegans may be relatively small (S. J. Salser and C. K., unpublished data). In addition to the fusions described below, we also constructed fusions that had shorter upstream and downstream regions. These had localized staining patterns but also frequently showed additional expression in other body regions.

A 4.1 kb fragment from the expression vector pPD21.28 (Fire et al., 1990) that contains the lacZ gene, simian virus 40 nuclear localization signal, and unc-54 3' untranslated region was cloned into a unique ApaI site of a 22 kb NruI subclone of cosmid C08C3 that contains the entire egl-5 coding sequence (see Figure 2). The resulting egl-5-lacZ fusion has about 17 kb upstream of the first predicted AUG and 3 kb downstream of the predicted stop codon.

To construct the lin-39-lacZ fusion, pPD21.28 was modified by removing the synthetic intron, simian virus 40 nuclear localization signal, and the unc-54 3' untranslated region. The 3.3 kb fragment containing only the lacZ gene was then cloned into a unique NcoI site in a 24 kb PstI-EcoRV subclone of cosmid F44F12 that contains lin-39. The resulting lin-39-lacZ fusion has about 13 kb upstream of the first predicted AUG and 3.8 kb downstream of the predicted stop codon.

To construct a lacZ fusion likely to represent the actual expression pattern of the gene containing ceh-23, we first isolated a relatively large (approximately 13 kb) fragment of genomic DNA whose 3' terminus was a SnaBI site within the ceh-23 homeobox. We joined the 3' end of this fragment in frame to lacZ using the pPD21.28 vector. We are not certain that this fusion contains all the regulatory sites required for authentic ceh-23 expression.
The lacZ fusion constructs (50 µg/ml for egl-5-lacZ; 15 µg/ml for lin-39-lacZ; 30 µg/ml for ceh-23-lacZ) were injected into him-5(e1490) or N2 (wild-type) hermaphrodites along with the rol-6(su1006) coinjection marker (100 µg/ml). Two lines that expressed the coinjection marker (Rol) were analyzed for each fusion. The DNA was not stably inherited in these lines, implying that it was maintained as an episome. The majority of Rol animals exhibited the staining patterns described in Figures 1 and 4. We integrated the lin-39-lacZ fusion into chromosome IV using γ irradiation (C. Kari, A. Fire, and R. Herman, personal communication); expression in this line was similar to that in lines carrying the DNA extrachromosomally. This line, muIs6, was used for analysis of lin-39-lacZ expression in larvae.

To assay for β-galactosidase, embryos were gently scraped from plates, allowed to settle on polylysine-coated slides, and frozen on dry ice. The embryos were dehydrated in -20°C acetone and rehydrated by a 90%, 60%, 30%, 10% acetone series at room temperature. The embryos were then incubated in staining solution from 3 hr to overnight at room temperature as described (Fire, 1992). Larvae were stained as described previously, with minor modifications (Fire et al., 1990).

Molecular Cloning of egl-5 and lin-39

egl-5: Initially, cosmids containing ceh-11 were used to attempt to rescue the egg-laying defect (Egl) phenotype of egl-5(u202) using rol-6(su1006) as a coinjection marker (100 µg/ml). One cosmid, C08C3 (10 µg/ml), fully rescued the Egl phenotype in 4 of 9 F1 transformants (progeny of injected animals). The ceh-11 homeobox from 11 egl-5 alleles (n945, n988, u202, e2399, e2502, e2506, e2508, n989, n486, e2495, and n1439) was amplified by PCR and sequenced by double-stranded DNA cycle sequencing (Bethesda Research Laboratories). In this initial screen, we detected DNA changes in three alleles, u202, e2495, and n486.
To eliminate the possibility of PCR artifacts and to confirm the changes, both strands from three independent PCR reactions were sequenced for these alleles.

lin-39: We used cosmids containing ceh-15 to attempt to rescue the Vul phenotype of mu26 using rol-6(su1006) as a coinjection marker (100 µg/ml). Cosmid F44F12 (30 µg/ml) fully rescued the Vul phenotype (a functional vulva was present and eggs were laid) in 2 out of 8 F1 transformants, and a smaller subclone (10 kb; 10 µg/ml) also containing the homeobox fully rescued 18 out of 34 F1 transformants. (Because lin-39 hermaphrodites are Vul, complementation testing was done using lin-39(n1760) following transformation rescue.) To show

that the ceh-75 homeobox was part of /in-39, a mutation was made at a unique Ncol site that occurs in the predicted hexapeptide sequence of ceh-15. The mutation was created in the 10 kb subclone by digesting with Ncol, filling in with Klenow, and religating. Sequencing showed that the resulting mutation was an opal stop codon. This mutant 10 kb fragment (ANco4; 10 @ml) failed to rescue (0 of 71 Fl transformants). To show that this nonsense mutation was likely to be responsible for the failure to rescue, a 1 kb wild-type fragment (20 ugl ml) that overlaps the mutation was coinjected along with ANco4 into mu26 animals. The Vul phenotype was partially rescued in 1 of 34 Fl transformants (a nonfunctional vulva was present; no eggs were laid). In addition, a mutation was found in the ceh-75 homeobox of the mu26 allele when the homeobox was amplified by PCR and sequenced by double-stranded DNA cycle sequencing (Bethesda Research Laboratories). Sequencing of both strands from three independent PCR reactions confirmed this mutation. Construction of h-39 mab-5; him-5 Mutants and lmmunofluorescence To construct the double mutant carrying the closely linked M-39 and mab-5 mutations, we identified non8ma Uric recombinants from heterozygotes of genotype +/in-39@7760) ++/sma-3(e497) +mab5(e7239) uric-36(e257); himG(e7490) using a dissecting microscope. In 2 of 22 Uric non-Sma recombinants, recombination had occurred between /in-39 and mab-5, allowing the isolation of /in-39 mab-5 uric-36 hermaphrodites. To remove the uric-36 mutation, we constructed +/in39mab-5unc-36/sma-3+++; him-5 hermaphroditesandscreened their progeny for Lin-39 Mab-5 non-Uric recombinants. 
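As an aside on the NcoI fill-in used above to disrupt the ceh-15 hexapeptide region, the following minimal Python sketch shows why digesting, filling in with Klenow, and religating changes the reading frame. NcoI cuts C^CATGG and leaves a 4-nucleotide 5' overhang, so filling in both ends and rejoining them duplicates CATG. The function name and the 16-nucleotide example sequence are hypothetical illustrations; they are not the real ceh-15 sequence, which is not given in this section.

```python
NCOI_SITE = "CCATGG"  # NcoI cuts C^CATGG, leaving a 4-nt 5' overhang (CATG)


def fill_in_and_religate(seq: str, site: str = NCOI_SITE) -> str:
    """Return the top strand after NcoI digestion, Klenow fill-in, and blunt religation."""
    if seq.count(site) != 1:
        raise ValueError("sketch assumes exactly one NcoI site")
    cut = seq.index(site) + 1            # top strand is cut after the first C
    left, right = seq[:cut], seq[cut:]   # right fragment starts with CATGG
    overhang = right[:4]                 # CATG
    # Klenow copies the overhang onto the recessed 3' end of the left fragment,
    # so the 4 nt appear twice once the blunt ends are rejoined.
    return left + overhang + right


# Hypothetical sequence, NOT the real ceh-15 hexapeptide region.
wild_type = "GATTACCATGGTTCAA"
mutant = fill_in_and_religate(wild_type)
print(mutant)                         # GATTACCATGCATGGTTCAA
print(len(mutant) - len(wild_type))   # 4
```

Because the insertion is 4 nucleotides, which is not a multiple of 3, every codon downstream of the site is read in a shifted frame. Whether and where that shifted frame yields a stop codon depends on the actual flanking sequence, which this sketch does not model.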
The homozygous triple mutant lin-39(n1760) mab-5(e1239); him-5(e1490) is unhealthy, grows slowly, and is moderately Unc except at larval stage L1.

Immunofluorescence with the mouse monoclonal antibody MH27, which stains a component of septate junctions (Francis and Waterston, 1991), was performed essentially as described (Kenyon, 1986).

Acknowledgments

We thank S. G. Clark and H. R. Horvitz for sharing unpublished information about lin-39 with us prior to publication and for sending us lin-39(n1760). We thank J. Sulston, A. Coulson, J. Hodgkin, and other Medical Research Council colleagues for help with the initial egl-5 rescue experiments. We thank R. Barstead, L. Miller, and B. Meyer for providing the cDNA libraries and R. Barstead and R. Waterston for MH27 antibodies. We also thank S. J. Salser and C. Hunter for help in sequence comparisons and S. J. Salser, C. Hunter, and C. Bargmann for help in identifying cells expressing lacZ. We thank all the members of the Kenyon lab for many stimulating discussions, much technical advice, and comments on the manuscript. Some nematode strains used in this work were provided by the Caenorhabditis Genetics Center, which is funded by the National Institutes of Health National Center for Research Resources. B. B. W. and N. T. R. were supported by predoctoral fellowships from the National Science Foundation. N. T. R. was also supported by a University of California, San Francisco graduate opportunity fellowship. M. M. M.-I. was supported by a long-term fellowship from the European Molecular Biology Organization. J. A. was supported by the Jane Coffin Childs Memorial Fund and the Bank of America-Giannini Foundation. A. C., whose work was performed in J. Hodgkin’s laboratory, was a holder of a Medical Research Council studentship. This work was supported by a National Institutes of Health grant to C. K., who is a Packard Foundation Fellow.

Received January 21, 1993; revised April 23, 1993.

References

Akam, M., Dawson, I., and Tear, G. (1988). Homeotic genes and the control of segment diversity. Development (Suppl.) 104, 123-133.

Brenner, S. (1974). The genetics of C. elegans. Genetics 77, 71-94.

Bürglin, T. R., Finney, M., Coulson, A., and Ruvkun, G. (1989). Caenorhabditis elegans has scores of homeobox-containing genes. Nature 341, 239-243.

Bürglin, T. R., Ruvkun, G., Coulson, A., Hawkins, N. C., McGhee, J. D., Schaller, D., Wittmann, C., Müller, F., and Waterston, R. H. (1991). Nematode homeobox cluster. Nature 351, 703.

Celniker, S. E., Keelan, D. J., and Lewis, E. B. (1989). The molecular genetics of the bithorax complex of Drosophila: characterization of the products of the Abdominal-B domain. Genes Dev. 3, 1424-1436.

Chalfie, M., Thomson, J. N., and Sulston, J. (1983). Induction of neuronal branching in Caenorhabditis elegans. Science 221, 61-63.

Chisholm, A. (1991). Control of cell fate in the tail region of C. elegans by the gene egl-5. Development 111, 921-932.

Clark, S. G., Chisholm, A. D., and Horvitz, H. R. (1993). Control of cell fates in the central body region of C. elegans by the homeobox gene lin-39. Cell 74 (this issue).

Costa, M., Weir, M., Coulson, A., Sulston, J., and Kenyon, C. (1988). Posterior pattern formation in C. elegans involves position-specific expression of a gene containing a homeobox. Cell 55, 747-756.

Cowing, D., and Kenyon, C. (1992). Expression of the homeotic gene mab-5 during Caenorhabditis elegans embryogenesis. Development 116, 481-490.

Cribbs, D. L., Pultz, M. A., Johnson, D., Mazzulla, M., and Kaufman, T. C. (1992). Structural complexity and evolutionary conservation of the Drosophila homeotic gene proboscipedia. EMBO J. 11, 1437-1449.

Dalton, D., Chadwick, R., and McGinnis, W. (1989). Expression and embryonic function of empty spiracles: a Drosophila homeobox gene with two patterning functions on the anterior-posterior axis of the embryo. Genes Dev. 3, 1940-1956.

Diederich, R. J., Merrill, V. K., Pultz, M. A., and Kaufman, T. C. (1989). Isolation, structure, and expression of labial, a homeotic gene of the Antennapedia complex involved in Drosophila head development. Genes Dev. 3, 399-414.

Ellis, H. (1985). Genetic control of programmed cell death in the nematode Caenorhabditis elegans. PhD thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts.

Fibi, M., Zink, B., Kessel, M., Colberg-Poley, A. M., Labeit, S., Lehrach, H., and Gruss, P. (1988). Coding sequence and expression of the homeobox gene Hox 1.3. Development 102, 349-359.

Fire, A. (1992). Histochemical techniques for locating E. coli β-galactosidase activity in transgenic organisms. Gene Anal. Techn. Applic. 9, 151-158.

Fire, A., Harrison, S. W., and Dixon, D. (1990). A modular set of lacZ fusion vectors for studying gene expression in Caenorhabditis elegans. Gene 93, 189-198.

Francis, R., and Waterston, R. H. (1991). Muscle cell attachment in Caenorhabditis elegans. J. Cell Biol. 114, 465-479.

Frohman, M. A., Boyle, M., and Martin, G. R. (1990). Isolation of the mouse Hox-2.9 gene; analysis of embryonic expression suggests that positional information along the anterior-posterior axis is specified by mesoderm. Development 110, 589-607.

Galliot, B., Dollé, P., Vigneron, M., Featherstone, M. S., Baron, A., and Duboule, D. (1989). The mouse Hox-1.4 gene: primary structure, evidence for promoter activity and expression during development. Development 107, 343-359.

Gehring, W. J., Müller, M., Affolter, M., Percival, S. A., Billeter, M., Qian, Y. Q., Otting, G., and Wüthrich, K. (1990). The structure of the homeodomain and its functional implications. Trends Genet. 6, 323-329.

Hawkins, N. C., and McGhee, J. D. (1990). Homeobox containing genes in the nematode Caenorhabditis elegans. Nucl. Acids Res. 18, 6101-6106.

Herskowitz, I. (1989). A regulatory hierarchy for cell specialization in yeast. Nature 342, 749-757.

Horvitz, H. R., Ellis, H. M., and Sternberg, P. W. (1982). Programmed cell death in nematode development. Neurosci. Comment. 1, 56-65.

Izpisua-Belmonte, J.-C., Falkenstein, H., Dollé, P., Renucci, A., and Duboule, D. (1991). Murine genes related to the Drosophila AbdB homeotic genes are sequentially expressed during development of the posterior part of the body. EMBO J. 10, 2279-2289.

Kamb, A., Weir, M., Rudy, B., Varmus, H., and Kenyon, C. (1989). Identification of genes from pattern formation, tyrosine kinase, and potassium channel families by DNA amplification. Proc. Natl. Acad. Sci. USA 86, 4372-4376.

Karch, F., Bender, W., and Weiffenbach, B. (1990). abd-A expression in Drosophila embryos. Genes Dev. 4, 1573-1587.

Kenyon, C. (1986). A gene involved in the development of the posterior body region of C. elegans. Cell 46, 477-487.

Kenyon, C., and Wang, B. (1991). A cluster of Antennapedia-class homeobox genes in a nonsegmented animal. Science 253, 516-517.

Kongsuwan, K., Allen, J., and Adams, J. M. (1989). Expression of Hox-2.4 homeobox gene directed by proviral insertion in a myeloid leukemia. Nucl. Acids Res. 17, 1881-1892.

Kornfeld, K., Saint, R. B., Beachy, P. A., Harte, P. J., Peattie, D. A., and Hogness, D. S. (1989). Structure and expression of a family of Ultrabithorax mRNAs generated by alternative splicing and polyadenylation in Drosophila. Genes Dev. 3, 243-258.

Laughon, A., Boulet, A. M., Bermingham, J. R., Laymon, R. A., and Scott, M. P. (1986). Structure of transcripts from the homeotic Antennapedia gene of Drosophila melanogaster: two promoters control the major protein-coding region. Mol. Cell. Biol. 6, 4676-4689.

LeMotte, P. K., Kuroiwa, A., Fessler, L. I., and Gehring, W. J. (1989). The homeotic gene Sex Combs Reduced of Drosophila: gene structure and embryonic expression. EMBO J. 8, 219-227.

Loer, C. M., and Kenyon, C. (1993). Serotonin-deficient mutants and male mating behavior in the nematode C. elegans. Neuroscience, in press.

McGinnis, W., and Krumlauf, R. (1992). Homeobox genes and axial patterning. Cell 68, 283-302.

Meijlink, F., de Laaf, R., Verrijzer, P., Destree, O., Kroezen, V., Hilkens, J., and Deschamps, J. (1987). A mouse homeobox containing gene on chromosome 11: sequence and tissue-specific expression. Nucl. Acids Res. 15, 6773-6786.

Pearson, W. R., and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444-2448.

Porteus, M. H., Bulfone, A., Ciaranello, R. D., and Rubenstein, J. L. (1991). Isolation and characterization of a novel cDNA clone encoding a homeodomain that is developmentally regulated in the ventral forebrain. Neuron 7, 221-229.

Price, M., Lemaistre, M., Pischetola, M., Di, L. R., and Duboule, D. (1991). A mouse gene related to Distal-less shows a restricted expression in the developing forebrain. Nature 351, 748-751.

Regulski, M., McGinnis, N., Chadwick, R., and McGinnis, W. (1987). Developmental and molecular analysis of Deformed; a homeotic gene controlling Drosophila head development. EMBO J. 6, 767-776.

Salser, S. J., and Kenyon, C. (1992). Activation of a C. elegans Antennapedia homolog in migrating cells controls their direction of migration. Nature 355, 255-258.

Schaller, D., Wittmann, C., Spicher, A., Müller, F., and Tobler, H. (1990). Cloning and analysis of three new homeobox genes from the nematode Caenorhabditis elegans. Nucl. Acids Res. 18, 2033-2036.

Schughart, K., Utset, M. F., Awgulewitsch, A., and Ruddle, F. H. (1988). Structure and expression of Hox-2.2, a murine homeobox-containing gene. Proc. Natl. Acad. Sci. USA 85, 5582-5586.

Scott, M. P. (1992). Vertebrate homeobox gene nomenclature. Cell 71, 551-553.

Scott, M. P., Tamkun, J. W., and Hartzell, G., III (1989). The structure and function of the homeodomain. Biochim. Biophys. Acta 989, 25-48.

Simeone, A., Gulisano, M., Acampora, D., Stornaiuolo, A., Rambaldi, M., and Boncinelli, E. (1992). Two vertebrate homeobox genes related to the Drosophila empty spiracles gene are expressed in the embryonic cerebral cortex. EMBO J. 11, 2541-2550.

Sulston, J. E., and Horvitz, H. R. (1977). Post-embryonic cell lineages of the nematode, Caenorhabditis elegans. Dev. Biol. 56, 110-156.

Tan, D.-P., Ferrante, J., Nazarali, A., Shao, X., Kozak, C. A., Guo, V., and Nirenberg, M. (1992). Murine Hox-1.11 homeobox gene structure and expression. Proc. Natl. Acad. Sci. USA 89, 6280-6284.

Vachon, G., Cohen, B., Pfeifle, C., McGuffin, M. E., Botas, J., and Cohen, S. M. (1992). Homeotic genes of the Bithorax complex repress limb development in the abdomen of the Drosophila embryo through the target gene Distal-less. Cell 71, 437-450.

Wood, W. B. (1988). Appendix 4: genetic map. In The Nematode Caenorhabditis elegans, W. B. Wood, ed. (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press), pp. 491-584.

GenBank Accession Numbers

The accession numbers for the egl-5, lin-39, and ceh-23 sequences reported in this paper are L19247, L19246, and L19249, respectively.

Note Added in Proof

The genomic positions of the C. elegans HOM-C genes were inferred from cross-hybridization data obtained during the C. elegans genome mapping project (Bürglin et al., 1991; Kenyon and Wang, 1991). These positions are now being confirmed by the C. elegans genome sequencing project. The positions of mab-5 and egl-5 have now been confirmed; the positions of ceh-13 and lin-39 relative to one another have not yet been determined unambiguously.
