Current Biology, Vol. 12, 1952–1958, November 19, 2002, ©2002 Elsevier Science Ltd. All rights reserved.

PII S0960-9822(02)01279-4

Integrating Interactome, Phenome, and Transcriptome Mapping Data for the C. elegans Germline

Albertha J.M. Walhout,1 Jérôme Reboul,1 Olena Shtanko,1 Nicolas Bertin,1 Philippe Vaglio,1 Hui Ge,1 Hongmei Lee,2 Lynn Doucette-Stamm,2 Kristin C. Gunsalus,3,4 Aaron J. Schetter,4 Diane G. Morton,4 Kenneth J. Kemphues,4 Valerie Reinke,5 Stuart K. Kim,6 Fabio Piano,3,4 and Marc Vidal1,7

1 Dana-Farber Cancer Institute and Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115
2 Genome Therapeutics, Waltham, Massachusetts 02154
3 Department of Biology, New York University, New York, New York 10011
4 Molecular Biology and Genetics, Cornell University, Ithaca, New York 14850
5 Department of Genetics, Yale University, New Haven, Connecticut 06520
6 Department of Developmental Biology and Genetics, Stanford University, Palo Alto, California 94305

Summary

By integrating functional genomic and proteomic mapping approaches, biological hypotheses should be formulated with increasing levels of confidence [1–5]. For example, yeast interactome and transcriptome data can be correlated in biologically meaningful ways [6–9]. Here, we combine interactome mapping data generated for a multicellular organism with data from both large-scale phenotypic analysis (“phenome mapping”) and transcriptome profiling. First, we generated a two-hybrid interactome map of the Caenorhabditis elegans germline by using 600 transcripts enriched in this tissue [10]. We compared this map to a phenome map of the germline obtained by RNA interference (RNAi) [34] and to a transcriptome map obtained by clustering worm genes across 553 expression profiling experiments [11]. In this dataset, we find that essential proteins have a tendency to interact with each other, that pairs of genes encoding interacting proteins tend to exhibit similar expression profiles, and that, for ~24% of germline interactions, both partners show overlapping embryonic lethal or high incidence of males RNAi phenotypes and similar expression profiles. We propose that these interactions are most likely to be relevant to germline biology. Similar integration of interactome, phenome, and transcriptome data should be possible for other biological processes in the nematode and for other organisms, including humans.

7 Correspondence: [email protected]

Results and Discussion

We reasoned that integration of interactome, phenome, and transcriptome mapping data for multicellular organisms would be most informative if one concentrated on a particular cell type or tissue. We selected the C. elegans hermaphrodite germline, since its formation and function require many fundamental biological processes, such as DNA replication, mitosis, apoptosis, and DNA repair.

To select an initial set of genes with a high likelihood of functional involvement in the germline, we used available microarray data [10]. Particularly, we focused on 762 germline-intrinsic and oocyte-enriched genes (the “germline” genes) [10] as a starting point for the generation of an interactome map. To clone the predicted full-length protein-encoding open reading frames corresponding to germline genes (gORFs), we used the Gateway recombinational cloning technology [12–14]. A total of 600 gORFs were PCR amplified successfully from a highly representative worm cDNA library, Gateway cloned, and subsequently transferred into two yeast two-hybrid plasmids expressing either DNA binding domain (DB) fusion proteins (DB-gORFs) or activation domain (AD) fusion proteins (AD-gORFs) (see the Experimental Procedures). Most of the gORFs that could not be amplified correspond to exon structure mispredictions and will require improved genome annotations before they can be cloned [15].

We then selected DB-gORF/AD-gORF potential interactors among all DB-gORF and AD-gORF pairwise combinations (600 × 600 = 3.6 × 10⁵) by using a relatively stringent and high-throughput version of the yeast two-hybrid system [16–18] (see the Experimental Procedures). In all, we identified 65 two-hybrid interactions involving 62 proteins (Table 1).
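As a quick check on the arithmetic above, the following minimal sketch computes the size of the all-against-all DB × AD search space and the fraction of tested combinations that scored as interactions. The function name `interaction_density` is illustrative only and does not come from the original work.

```python
def interaction_density(n_db, n_ad, n_interactions):
    """Return (number of pairwise combinations, interactions per combination)."""
    search_space = n_db * n_ad
    return search_space, n_interactions / search_space


# Values quoted in the text: 600 DB-gORFs x 600 AD-gORFs, 65 interactions.
space, density = interaction_density(600, 600, 65)
print(f"{space:.1e} combinations, density {density:.1e}")
```

Expressing the yield as a density rather than a raw count makes screens with different numbers of baits and preys comparable.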
The number of interactions detected over all possible pairwise combinations in the germline map (65/3.6 × 10⁵ = 1.8 × 10⁻⁴) was very close to the number of interactions (~400) detected with ~130 baits screened against an AD-cDNA library in 3 previously published C. elegans interactome maps [13, 19, 20] (~400/2.2 × 10⁶ = 1.7 × 10⁻⁴), and this finding indicates that similar levels of saturation were obtained with both approaches.

The potential protein-protein interactions were organized into interaction networks (Figure 1). A total of 17 homodimers were identified, and 6 of the 48 putative heterodimers were found in both DB-X/AD-Y and DB-Y/AD-X two-hybrid configurations, leaving 42 distinct heterodimers. Interestingly, detailed genetic characterization is currently available for only 5 of the 62 gORFs engaged in two-hybrid interactions [21]. This illustrates how interactome data can provide functional annotations for previously uncharacterized predicted gene products [3]. The map contains one potential interolog [13] (C27H6.2/T22D1.10), since the corresponding orthologous yeast proteins have also been reported to interact [21]. This underscores the observation that potential protein-protein interactions identified in C. elegans may have predictive value for other species [22].

Interestingly, several proteins found in the germline map were also identified in our previously published interactome maps [13, 19, 20] (Figure 1). Multiple interactions with diverse partners are consistent with the pleiotropic roles exerted by many proteins. For example, Ras pathway proteins that regulate vulval development are also known to be required for meiotic progression in the germline [23].

In addition to 10 isolated interactions and 2 pairs of contiguous interactions that connect 3 proteins, 28 of the 42 distinct heterodimers (67%) were found to form a single interaction network of contiguous interactions (Figure 1). This network contains six circular interaction clusters, in which a protein is connected to itself via interactions with other potential partners (i.e., X, Y, Z, …X). This may point to biologically relevant regulatory pathways or protein complexes [13]. All interaction features, including integration with other functional genomic data (see below), can be found by using interactomedb at http://vidal.dfci.harvard.edu/interactomedb.pl (N.B. et al., unpublished data).

The interpretation of large-scale two-hybrid mapping data can be complicated by the occurrence of false negatives and false positives [3, 24]. For example, interactions are probably missing in our “core” germline interactome map (i.e., false negatives). To address this issue, additional proteomic screens will have to be carried out [3]. On the other hand, it is possible that some of the potential interactions shown here are not taking place in vivo (i.e., false positives). Below, we describe how integration of protein interaction maps with other functional genomic data might facilitate their interpretation [5].

Table 1. Protein Interaction Pairs

DB            AD            Times Found
T09B4.2*      T09B4.2       309
C07H6.5       R05D11.8      62
C36C9.1       F28D1.2       40
C52E4.4       F35G12.12     39
C27H6.2       T22D1.10      37
T22D1.10      C27H6.2       33
F10D11.2*     F10D11.2      31
Y11D7A.13     Y11D7A.12     29
F35G12.12     C52E4.4       29
C05C8.6*      C05C8.6       23
F54D10.7*     F54D10.7      22
R07B7.2       F41H10.10     16
F52C6.3*      F52C6.3       15
C36C9.1       C27A2.6       15
M04B2.3       T07G12.6      14
C16C8.11*     C16C8.11      12
T05H4.14      F52C6.3       12
F28D1.2       H02I12.5      10
F46A9.4       F54D5.9       9
T05H4.14      F54D10.7      9
C36C9.1       C50E3.13      9
Y45F10A.2     ZC168.4       8
T05B9.1*      T05B9.1       6
F41H10.10     R07B7.2       5
T24D1.3       M7.5          4
C28D4.3*      C28D4.3       4
C05C10.5      C27A2.6       4
R06B9.3*      R06B9.3       4
R11A5.2       ZK1055.1      3
T20G5.11*     T20G5.11      3
C50C3.8*      C50C3.8       3
C50E3.13*     C50E3.13      3
D1046.1*      D1046.1       3
C07H6.5       W03C9.7       3
T27F2.1       C27A2.6       3
T27F2.1       F28D1.2       3
R06B9.3       H02I12.5      2
R07B7.2*      R07B7.2       2
F26H11.4      F28D1.2       2
ZC168.4       Y45F10A.2     2
R06B9.3       T12E12.4      2
C05C10.5      F54D10.7      2
F46A9.4       F02D10.6      1
T26A5.7       W03D2.4       1
F39H2.4       F11E6.2       1
F39H2.4       F28D1.2       1
F39H2.4       F30F8.3       1
ZC513.6       F39H11.1      1
T01B7.5       F54D10.7      1
Y45F10A.2     Y43E12A.1     1
F54D5.9       F46A9.4       1
T05H4.14      H02I12.5      1
F26H11.4      F54D10.7      1
F57C9.5       F41H10.10     1
C49F5.6       T14G10.6      1
C36C9.1       M7.2          1
C36C9.1       R07B7.2       1
Y43E12A.1     Y45F10A.2     1
C07H6.5       C27A2.6       1
T12E12.4*     T12E12.4      1
F11A3.2       C50E3.4       1
C36C9.1*      C36C9.1       1
C13F10.7*     C13F10.7      1
F52C6.3       T07G12.11     1
F42H10.7      C27A2.6       1

Homodimers are marked with an asterisk (*).
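The bookkeeping applied to Table 1 (separating homodimers from heterodimers, collapsing the two DB/AD orientations of one heterodimer, and grouping proteins into networks of contiguous interactions) can be sketched as follows. This is a minimal illustration, not the authors' software: the helper names `classify` and `connected_components` are hypothetical, and only the first ten rows of Table 1 are used as input.

```python
from collections import defaultdict

# First ten (DB, AD) rows of Table 1; the full table has 65 rows.
PAIRS = [
    ("T09B4.2", "T09B4.2"), ("C07H6.5", "R05D11.8"), ("C36C9.1", "F28D1.2"),
    ("C52E4.4", "F35G12.12"), ("C27H6.2", "T22D1.10"), ("T22D1.10", "C27H6.2"),
    ("F10D11.2", "F10D11.2"), ("Y11D7A.13", "Y11D7A.12"),
    ("F35G12.12", "C52E4.4"), ("C05C8.6", "C05C8.6"),
]


def classify(pairs):
    """Split two-hybrid hits into homodimers and distinct (unordered) heterodimers."""
    hetero_rows = [(db, ad) for db, ad in pairs if db != ad]
    row_set = set(hetero_rows)
    homodimers = {db for db, ad in pairs if db == ad}
    distinct = {frozenset(row) for row in hetero_rows}
    # Reciprocal: found in both DB-X/AD-Y and DB-Y/AD-X configurations.
    reciprocal = {frozenset(row) for row in hetero_rows if (row[1], row[0]) in row_set}
    return homodimers, hetero_rows, distinct, reciprocal


def connected_components(edges):
    """Group proteins into networks of contiguous interactions (depth-first search)."""
    graph = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


homo, rows, distinct, reciprocal = classify(PAIRS)
networks = connected_components(distinct)
print(len(homo), len(rows), len(distinct), len(reciprocal), len(networks))
```

Run on all 65 rows of Table 1, the same classification should give the 17 homodimers and 42 distinct heterodimers quoted in the text.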
Figure 1. A Core Germline Interactome Map
The germline protein-protein interactions were visualized as an interaction network [33]. The direction of the arrows indicates the two-hybrid configuration from DB-X to AD-Y. Red indicates proteins encoded by ORFs that have been genetically characterized in some detail [21]. Gray indicates uncharacterized proteins. Yellow indicates proteins that provide connections to the three previously published C. elegans interactome maps [13, 19, 20]. Orange indicates a protein that was characterized previously and is connected to the vulval development interactome map [13].

Genuinely interacting proteins are expected to perform related cellular functions, and thus loss-of-function alleles of their respective genes are expected to confer overlapping (or opposite) phenotypes [20]. Hence, we combined the interaction map with “phenome” mapping data generated by high-throughput RNAi analysis of the same set of germline genes [25, 34] (Figure 2A). First, we investigated whether interactome and phenome data can be correlated globally by considering the number of interactions observed between the products of gORFs conferring embryonic lethal (Emb) phenotypes (“intra-Emb interactions”). Since ~43% (253/587) of gORFs scored as Emb [34], a uniform distribution predicts that ~18% of heterotypic interactions in the germline should be intra-Emb (see the Experimental Procedures). However, we observed a significant enrichment for intra-Emb interactions in the core germline interactome map (17/42 = ~40%, p value = 5.5 × 10⁻⁴) (Figure 3). In general, different essential proteins can be involved in unrelated molecular processes. However, the gORF-encoded products in particular are enriched for proteins involved in germline biology and early embryogenesis [10]. Thus, it is possible that intra-Emb interactions are more likely to be relevant than others. In any case, the general observation that essential proteins in the germline dataset are more likely to interact with each other than with nonessential proteins suggested that large-scale comparisons of potential protein interactions with more specific phenotypes should be valuable for C. elegans.

An example of a more specific phenotype potentially related to germline biology is the “high incidence of males” (Him) phenotype. Although only a few gORFs confer a Him phenotype by RNAi [34], we did observe one “intra-Him interaction” (F57C9.5/F41H10.10). While with only one such interaction it was not possible to assess the statistical significance of this finding, we suggest that F57C9.5/F41H10.10 has a high likelihood of relevance (see below).

Given the expected pleiotropic nature of the proteins involved in the interactions described above, we investigated how large numbers of defined cellular phenotypes, rather than a few organismal phenotypes, could also be used to improve interactome maps. A total of 47 phenotypic descriptors were analyzed by time-lapse microscopy following RNAi of each one of 161 Emb-gORFs [34]. We attempted to overlap the interactome map with the phenotypic clusters, or “phenoclusters” [20], obtained by grouping such descriptors by virtue of their level of similarity [34]. Ten of the 52 “Emb-gORFs” corresponding to proteins involved in heterodimeric interactions in the core germline interactome map were analyzed at this level (Table 2, DIC), and for 4 heterodimeric interactions, such detailed phenotypic information is available for both partners. Strikingly, 2 potential interactions involve protein pairs that show 4 overlapping embryonic phenotypes out of 47 (Y43E12A.1/PUF-3 and PUF-3/CYB-1) (Figure 2A) (see below). The probability of these four phenotypic descriptors overlapping by chance is very small (p < 0.001, see the Experimental Procedures). Also, it is important to note that, at this level of phenotypic resolution, a 100% overlap may not be expected because of potential pleiotropy of gene function.
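The intra-Emb expectation quoted above follows from squaring the Emb frequency (0.43² ≈ 0.18). The sketch below reproduces that arithmetic and then asks how surprising 17 intra-Emb pairs out of 42 would be. The one-sided binomial tail is an assumption made for illustration; the test the authors used is described in their Experimental Procedures, so the p value printed here need not equal the published 5.5 × 10⁻⁴. Function names are hypothetical.

```python
from math import comb


def expected_intra_fraction(class_fraction):
    """Chance that both partners of a random pair belong to a class of this frequency."""
    return class_fraction ** 2


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


emb_fraction = 253 / 587      # Emb gORFs among those scored by RNAi
expected = expected_intra_fraction(emb_fraction)
observed, total = 17, 42      # intra-Emb heterodimers / distinct heterodimers
tail = binomial_upper_tail(observed, total, expected)
print(f"expected {expected:.1%}, observed {observed / total:.1%}, tail p = {tail:.1e}")
```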
Because DIC data are available for only a few interaction pairs, it is difficult at this point to address the statistical significance of the overlap between protein-protein interactions and phenoclusters. However, we believe that interactions between proteins that share similar RNAi phenotypes are more likely to be relevant to germline biology than those that do not. Taken together, these data suggest that overlapping interactome data with phenome data at different levels of phenotypic resolution should help with the interpretation of increasingly large interactome data sets.

Pairs of yeast genes encoding interacting proteins show a high likelihood of being coexpressed under defined experimental conditions, such as in response to stress or during the cell cycle [6–8]. To test if this type of correlation can be used for multicellular organisms and to assess whether it could be extended to multiple, combined expression profiles, the germline interactome map was correlated with a C. elegans transcriptome map obtained by measuring the expression of most worm genes across 553 different experiments [11] (Figure 2B). In the resulting “topomap”, expression clusters are visualized as “mountains” [11]. First, we assessed the number of heterotypic interactions observed between the products of gORFs that belong to common mountains (“intra-mountain” interactions). A uniform distribution would predict 23% intra-mountain interactions; however, the observed percentage is 50% (21/42) (see the Experimental Procedures) (Figure 3). Thus, there is a significant enrichment for intra-mountain interactions in the core germline interactome map (p value = 1.2 × 10⁻⁴). We then considered each individual mountain. Strikingly, 19 intra-mountain interactions reside in mountain 07, whereas a uniform distribution would have predicted 4.4 interactions (p value = 2.2 × 10⁻⁸). Since mountain 07 is enriched for genes involved in meiosis and mitosis [11], these interactions may be involved in these processes (see below). For the genes corresponding to two protein-protein interactions (Y43E12A.1/PUF-3 and PUF-3/CYB-1), the expression profiles of each of the 553 experiments [9] were aligned.
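One way to read the "uniform distribution" expectation for intra-mountain pairs is as the sum of squared mountain frequencies, by analogy with the 43% to ~18% arithmetic used for Emb genes. The sketch below takes that reading as an assumption; the mountain sizes behind the published 23% are not given in this text, so the labels used here are a hypothetical toy set. Only the fold-enrichment lines use figures quoted above.

```python
from collections import Counter


def expected_intra_cluster_fraction(labels):
    """Sum over clusters of (cluster frequency)^2: the chance that two genes
    drawn independently (with replacement) fall in the same cluster."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())


def fold_enrichment(observed, expected):
    """Ratio of the observed value to the value expected by chance."""
    return observed / expected


# Hypothetical mountain labels for a toy gene set (not data from the paper).
toy_labels = ["07"] * 4 + ["02"] * 3 + ["11"] * 2 + ["05"]
print(round(expected_intra_cluster_fraction(toy_labels), 2))

# Figures quoted in the text.
print(round(fold_enrichment(21 / 42, 0.23), 1))  # all mountains: observed vs expected fraction
print(round(fold_enrichment(19, 4.4), 1))        # mountain 07: observed vs expected count
```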
This alignment shows striking similarity between the gene pairs and indicates that they were coexpressed under discrete experimental conditions (Figure 2B). Taken together, it is likely that C. elegans interactome maps will be improved by the integration with transcriptome mapping data (see below).

Figure 2. Integration of Interactome, Phenome, and Transcriptome Maps
The 42 distinct heterodimers are shown.
(A) Integration of interactome and phenome data. The germline interactome map was combined with a phenome map obtained by RNAi [34]. The phenotypes conferred by RNAi for genes encoding interacting proteins are indicated in red (Emb), green (Him), yellow (Emb and Him), and gray (no embryonic lethality detected). For 161 Emb-ORFs with sufficient penetrance of the lethal phenotype, specific early embryonic phenotypes were examined by time-lapse microscopy [34] and resulted in a string of 47 phenotypic descriptors in which 1 indicates a deviation from wild-type, 0 indicates wild-type phenotype, and n indicates “not determined”. Two protein-protein interactions (Y43E12A.1/PUF-3 and PUF-3/CYB-1) showed some phenotypic overlap, indicating that these interactions may have a relatively high likelihood of biological relevance.
(B) Integration of interactome and transcriptome data. The germline interactome map was combined with a transcriptome map obtained by combining 553 microarray experiments [11]. In the resulting topomap, genes with similar expression profiles are clustered and grouped into mountains [11]. The topomap location for each ORF corresponding to the proteins in the interactome map is indicated in different colors, each corresponding to a separate mountain. For one ORF, the topomap location could not be found (NF). For three gORFs, engaged in two protein-protein interactions, the individual expression profiles were aligned (upper two bars). Green indicates lower expression, and red indicates higher expression compared to control [11]. For each pair, the expression profiles were overlapped, and similar over- or underexpression relative to control (2-fold or higher) is indicated in yellow (lower bar). Opposite over- or underexpression (2-fold or higher) is indicated in blue. Details about aligning expression profiles of genes corresponding to yeast two-hybrid interacting proteins will be described elsewhere (N.B. et al., unpublished data).
(C) Combined integration of interactome, phenome, and transcriptome data. Red lines indicate two-hybrid interactions between proteins for which each partner has a similar RNAi phenotype and a similar topomap location.

Figure 3. Correlation of Interactome Data with Phenome and Transcriptome Data
The percentage of expected versus observed heterodimers involving ORFs that share similar RNAi phenotypes (phenome), topomap location (transcriptome), or both (phenome + transcriptome) is indicated (see the Experimental Procedures for details).

Finally, we tested to what extent the core germline interactome map could be simultaneously combined with both phenome and transcriptome maps (Figure 2C). A uniform distribution would predict ~4% interactions for which both partners confer either an Emb and/or Him phenotype and belong to a common mountain (“intra-Emb/Him::intra-mountain interactions”) (see the Experimental Procedures). However, the observed percentage of such intra-Emb/Him::intra-mountain interactions is ~24% (10/42), which is a significant enrichment (p value = 4.7 × 10⁻⁶) (Figure 3). We propose that these interactions have the highest likelihood of relevance to germline biology.

Two examples illustrate the value of integrating different functional genomic data. Firstly, the interaction between F57C9.5 and F41H10.10 is likely to be involved in germline biology since both corresponding genes confer a Him phenotype by RNAi, which points to nondisjunction of the X chromosome during meiosis (Figure 2A). The hypothesis that this interaction is important for meiosis is strengthened by the observation that both ORFs are found in topomap mountain 07, which is enriched for mitotic and meiotic genes [11] (Figures 2B and C). In addition, both protein products are predicted to contain a HORMA domain [26], which is present in several other meiotic proteins such as HIM-3 [27]. Secondly, the potential Y43E12A.1/PUF-3 and PUF-3/CYB-1 interactions are intriguing since the corresponding gene pairs confer an Emb phenotype by RNAi, share several phenotypic descriptors, and have multiple similar expression profiles (Figure 2). Y43E12A.1 and cyb-1 both encode homologs of the cell cycle regulator cyclin B, and PUF-3 is one of 11 predicted pumilio repeat proteins in C. elegans [28] (Table 2).
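The combined filter described above (both partners share an Emb or Him RNAi phenotype and sit in the same topomap mountain) can be sketched as a simple annotation lookup. The annotations below are copied from Table 2 for a handful of ORFs and the pairs from Table 1; the function names are hypothetical. Treating the phenome and transcriptome criteria as independent, the chance expectation is roughly the product of the two individual expectations (0.18 × 0.23 ≈ 0.04), which is an assumption that happens to agree with the ~4% quoted in the text.

```python
# ORF -> (RNAi phenotypes, topomap mountain), copied from Table 2 for a few ORFs.
ANNOTATION = {
    "F57C9.5":   ({"Him"}, "07"),
    "F41H10.10": ({"Him"}, "07"),
    "Y45F10A.2": ({"Emb"}, "07"),  # PUF-3, the partner shared by both cyclin B homologs
    "ZC168.4":   ({"Emb"}, "07"),  # CYB-1
    "Y43E12A.1": ({"Emb"}, "07"),
    "C27H6.2":   ({"Emb"}, "05"),
    "T22D1.10":  ({"Emb"}, None),  # topomap location not found
    "C36C9.1":   (set(), "07"),    # no Emb or Him phenotype detected
    "C27A2.6":   ({"Emb"}, "07"),
}

# A few distinct heterodimers from Table 1.
PAIRS = [
    ("F57C9.5", "F41H10.10"),
    ("Y45F10A.2", "ZC168.4"),
    ("Y43E12A.1", "Y45F10A.2"),
    ("C27H6.2", "T22D1.10"),
    ("C36C9.1", "C27A2.6"),
]


def shares_phenotype(a, b):
    """True when both partners confer at least one common RNAi phenotype."""
    return bool(ANNOTATION[a][0] & ANNOTATION[b][0])


def shares_mountain(a, b):
    """True when both partners map to the same, known topomap mountain."""
    mountain_a, mountain_b = ANNOTATION[a][1], ANNOTATION[b][1]
    return mountain_a is not None and mountain_a == mountain_b


def high_confidence(pairs):
    """Keep interactions supported by both phenome and transcriptome data."""
    return [(a, b) for a, b in pairs if shares_phenotype(a, b) and shares_mountain(a, b)]


expected_both = 0.18 * 0.23  # assumes the two criteria are independent
print(high_confidence(PAIRS), f"expected by chance: {expected_both:.1%}")
```

In this subset, C27H6.2/T22D1.10 fails only on the transcriptome criterion and C36C9.1/C27A2.6 only on the phenome criterion, which shows how each data type removes a different set of candidates.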
PUF proteins have been shown to regulate mRNA stability in several model organisms [28]. Intriguingly, cyclin B mRNA stability is regulated by PUF proteins in Xenopus and Drosophila [28, 29]. The similarities of the discrete RNAi phenotypes and expression profiles displayed by these genes support the idea of a functional link between them in the early C. elegans embryo. The two-hybrid interactions of CYB-1 and Y43E12A.1 with PUF-3 further suggest that there may be alternative levels of cyclin B regulation by PUF proteins in the C. elegans germline.

By combining interactome, phenome, and transcriptome data for germline-enriched C. elegans genes, we have identified ten potential protein-protein interactions for which both gORFs share either Emb or Him RNAi phenotypes and related expression profiles. We propose that these interactions have a high likelihood of relevance to germline biology and/or early embryogenesis. The work presented here indicates that the interpretation of interactome maps can be improved by the integration with phenome and transcriptome mapping data and, potentially, vice versa. Integration of functional genomic data should also be applicable to other tissues or biological processes in C. elegans. Finally, the availability of a near complete human genome sequence [30, 31] and the establishment of RNAi in mammalian cells [32] should facilitate similar strategies for human biology.

Supplementary Material

Supplementary Material including the Experimental Procedures is available at http://images.cellpress.com/supmat/supmatin.htm.

Acknowledgments

We thank J. Lamb and F. Roth for helpful discussions, J. Dekker and V. Rebel for critical reading of the manuscript, Corey McCowan for administrative support, the GenomeVision Service sequencing staff at Genome Therapeutics for their help, and an anonymous reviewer for substantial help in improving the manuscript. This work was supported by grants 5R01HG01715-02 (National Human Genome Research Institute and National Institute of General Medical Sciences), 7 R33 CA81658-02 (National Cancer Institute), and 232 (Merck Genome Research Institute) awarded to M.V. and by the National Center for Research Resources (S.K.K.).

Received: June 24, 2002
Revised: July 24, 2002
Accepted: August 22, 2002
Published: November 19, 2002

References

1. Lockhart, D.J., and Winzeler, E.A. (2000). Genomics, gene expression and DNA arrays. Nature 405, 827–835.
2. Pandey, A., and Mann, M. (2000). Proteomics to study genes and genomes. Nature 405, 837–846.

Table 2. Germline Interactome ORFs

ORF           Topomap   Phenotype
C05C10.5      07        Emb, Him
C05C8.6       07        no
C07H6.5       07        Emb
C13F10.7      02        no
C16C8.11      07        no
C27A2.6       07        Emb
C27H6.2       05        Emb
C28D4.3       07        no
C36C9.1       07        no
C49F5.6       07        no
C50C3.8       11        no
C50E3.13      07        no
C50E3.4       07        Emb
C52E4.4       20        Emb
D1046.1       05        no
F02D10.6      07        Emb
F10D11.2      07        no
F11A3.2       05        Emb
F11E6.2       07        no
F26H11.4      07        no
F28D1.2       12        Emb
F30F8.3       07        no
F35G12.12     02        no
F39H11.1      18        no
F39H2.4       07        Emb
F41H10.10     07        Him
F42H10.7      02        no
F46A9.4       07        Emb
F52C6.3       02        Emb
F54D10.7      07        no
F54D5.9       07        Emb
F57C9.5       07        Him
H02I12.5      07        Emb
M04B2.3       11        no
M7.2          07        Emb
M7.5          02        no
R05D11.8      02        no
R06B9.3       11        no
R07B7.2       07        no
R11A5.2       11        no
T01B7.5       05        no
T05B9.1       07        no
T05H4.14      11        Emb
T07G12.11     02        Emb
T07G12.6      02        no
T09B4.2       07        Emb
T12E12.4      05        Emb
T14G10.6      07        no
T20G5.11      07        no
T22D1.10      NF(a)     Emb
T24D1.3       07        Emb
T26A5.7       11        Emb
T27F2.1       02        Emb
W03C9.7       07        Emb
W03D2.4       11        Emb
Y11D7A.12     07        Emb
Y11D7A.13     07        no
Y43E12A.1     07        Emb
Y45F10A.2     07        Emb
ZC168.4       07        Emb
ZC513.6       07        no
ZK1055.1      11        no

(a) Not found.

Gene (in row order, blank rows omitted): cgh-1, rpt-1, skr-2, klc-1, drp-1, set-1, mex-1, pcn-1, puf-3, cyb-1, dao-1, hcp-1

Blast (in row order, blank rows omitted):
BTB/POZ domain DEAD box, helicase ubiquitin domain DSH ATPase, helicase glutamine synthetase Ras GEF domain MATH domain, BTB domain
ATPase, 26S proteasome RNA binding none translation initiation factor 2 domain
PDZ
HORMA ES2 Skp1 ubiquitin domain ubiquitin domain F-box HORMA TFIIF domain TPR domain ThiF domain, E1 ligase
WD domain
dynamin domains TM domains RNA binding ATPase, helicase RING finger SET domain Skip domain Zn-fingers PCNA
cyclin B pumilio repeats cyclin B Zn-fingers coiled coil, ATPase

DIC Microscopy: yes for 10 ORFs (blank rows omitted)

3. Walhout, A.J.M., and Vidal, M. (2001). Protein interaction maps for model organisms. Nat. Rev. Mol. Cell. Biol. 2, 55–62.
4. Marcotte, E.M., Pellegrini, M., Thompson, M.J., Yeates, T.O., and Eisenberg, D. (1999). A combined algorithm for genome-wide prediction of protein function. Nature 402, 83–86.
5. Vidal, M. (2001). A biological atlas of functional maps. Cell 104, 333–339.
6. Ge, H., Liu, Z., Church, G., and Vidal, M. (2001). Correlation between transcriptome and interactome mapping data from Saccharomyces cerevisiae. Nat. Genet. 29, 482–486.
7. Grigoriev, A. (2001). A relationship between gene expression and protein interactions on the proteome scale: analysis of the bacteriophage T7 and the yeast Saccharomyces cerevisiae. Nucleic Acids Res. 29, 3513–3519.
8. Jansen, R., Greenbaum, D., and Gerstein, M. (2002). Relating whole-genome expression data with protein-protein interactions. Genome Res. 12, 37–46.
9. Kemmeren, P., van Berkum, N.L., Vilo, J., Bijma, T., Donders, R., Brazma, A., and Holstege, F.C.P. (2002). Protein interaction verification and functional annotation by integrated analysis of genome-scale data. Mol. Cell 9, 1133–1143.
10. Reinke, V., Smith, H.E., Nance, J., Wang, J., Van Doren, C., Begley, R., Jones, S.J.M., Davis, E.B., Scherer, S., Ward, S., et al. (2000). A global profile of germline gene expression in C. elegans. Mol. Cell 6, 605–616.
11. Kim, S.K., Lund, J., Kiraly, M., Duke, K., Jiang, M., Stuart, J.M., Eizinger, A., Wylie, B.N., and Davidson, G.S. (2001). A gene expression map for Caenorhabditis elegans. Science 293, 2087–2092.
12. Hartley, J.L., Temple, G.F., and Brasch, M.A. (2000). DNA cloning using in vitro site-specific recombination. Genome Res. 10, 1788–1795.
13. Walhout, A.J.M., Sordella, R., Lu, X., Hartley, J.L., Temple, G.F., Brasch, M.A., Thierry-Mieg, N., and Vidal, M. (2000). Protein interaction mapping in C. elegans using proteins involved in vulval development. Science 287, 116–122.
14. Walhout, A.J.M., Temple, G.F., Brasch, M.A., Hartley, J.L., Lorson, M.A., van den Heuvel, S., and Vidal, M. (2000). GATEWAY recombinational cloning: application to the cloning of large numbers of open reading frames or ORFeomes. Methods Enzymol. 328, 575–592.
15. Reboul, J., Vaglio, P., Tzellas, N., Thierry-Mieg, N., Moore, T., Jackson, C., Shin-i, T., Kohara, Y., Thierry-Mieg, D., Thierry-Mieg, J., et al. (2001). Open-reading frame sequence tags (OSTs) support the existence of at least 17,300 genes in C. elegans. Nat. Genet. 27, 1–5.
16. Fields, S., and Song, O. (1989). A novel genetic system to detect protein-protein interactions. Nature 340, 245–246.
17. Vidal, M. (1997). The reverse two-hybrid system. In The Yeast Two-Hybrid System, P. Bartel and S. Fields, eds. (New York: Oxford University Press), pp. 109–147.
18. Walhout, A.J.M., and Vidal, M. (2001). High-throughput yeast two-hybrid assays for large-scale protein interaction mapping. Methods 24, 297–306.
19. Davy, A., Bello, P., Thierry-Mieg, N., Vaglio, P., Hitti, J., Doucette-Stamm, L., Thierry-Mieg, D., Reboul, J., Boulton, S., Walhout, A.J., et al. (2001). A protein-protein interaction map of the Caenorhabditis elegans 26S proteasome. EMBO Rep. 2, 821–828.
20. Boulton, S.J., Gartner, A., Reboul, J., Vaglio, P., Dyson, N., Hill, D.E., and Vidal, M. (2002). Combined functional genomic maps of the C. elegans DNA damage response. Science 295, 127–131.
21. Costanzo, M.C., Hogan, J.D., Cusick, M.E., Davis, B.P., Fancher, A.M., Hodges, P.E., Kondu, P., Lengieza, C., Lew-Smith, J.E., Lingner, C., et al. (2000). The yeast proteome database (YPD) and Caenorhabditis elegans proteome database (WormPD): comprehensive resources for the organization and comparison of model organism protein information. Nucleic Acids Res. 28, 73–76.
22. Matthews, L.R., Vaglio, P., Reboul, J., Ge, H., Davis, B.P., Garrels, J., Vincent, S., and Vidal, M. (2001). Identification of potential interaction networks using sequence-based searches for conserved protein-protein interactions or “interologs”. Genome Res. 11, 2120–2126.
23. Church, D.L., Guan, K.L., and Lambie, E.J. (1995). Three genes of the MAP kinase cascade, mek-2, mpk-1/sur-1 and let-60 ras, are required for meiotic cell cycle progression in Caenorhabditis elegans. Development 121, 2525–2535.
24. Walhout, A.J.M., Boulton, S.J., and Vidal, M. (2000). Yeast two-hybrid systems and protein interaction mapping projects for yeast and worm. Yeast 17, 88–94.
25. Fire, A., Xu, S., Montgomery, M.K., Kostas, S.A., Driver, S.E., and Mello, C.C. (1998). Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391, 806–811.
26. Aravind, L., and Koonin, E.V. (1998). The HORMA domain: a common structural denominator in mitotic checkpoints, chromosome synapsis and DNA repair. Trends Biochem. Sci. 23, 284–286.
27. Zetka, M.C., Kawasaki, I., Strome, S., and Muller, F. (1999). Synapsis and chiasma formation in Caenorhabditis elegans require HIM-3, a meiotic chromosome core component that functions in chromosome segregation. Genes Dev. 13, 2258–2270.
28. Wickens, M., Bernstein, D.S., Kimble, J., and Parker, R. (2002). A PUF family portrait: 3′UTR regulation as a way of life. Trends Genet. 18, 150–157.
29. Nakahata, S., Katsu, Y., Mita, K., Inoue, K., Nagahama, Y., and Yamashita, M. (2001). Biochemical identification of Xenopus pumilio as a sequence-specific cyclin B1 mRNA-binding protein that physically interacts with a nanos homolog, Xcat-2, and a cytoplasmic polyadenylation element-binding protein. J. Biol. Chem. 276, 20945–20953.
30. Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., Fitzhugh, W., et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860–921.
31. Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., et al. (2001). The sequence of the human genome. Science 291, 1304–1351.
32. Elbashir, S.M., Harborth, J., Lendeckel, W., Yalcin, A., Weber, K., and Tuschl, T. (2001). Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411, 494–498.
33. Schwikowski, B., Uetz, P., and Fields, S. (2000). A network of protein-protein interactions in yeast. Nat. Biotechnol. 18, 1257–1261.
34. Piano, F., Schetter, A.J., Morton, D.G., Gunsalus, K.C., Reinke, V., Kim, S.K., and Kemphues, K.J. (2002). Gene clustering based on RNAi phenotypes of ovary-enriched genes in C. elegans. Curr. Biol. 12, 1959–1964.