Eur. J. Biochem. 204, 21-30 (1992)
© FEBS 1992
Cross-species comparison of the sequence of the leukaemia inhibitory factor gene and its protein
Tracy A. WILLSON, Donald METCALF and Nicholas M. GOUGH
The Walter and Eliza Hall Institute of Medical Research, Post Office, Royal Melbourne Hospital, Parkville, Victoria, Australia
(Received July 22, 1991) - EJB 91 0970
Leukaemia inhibitory factor (LIF) is a pleiotropic growth factor active in diverse cell systems in both the adult and the embryo. The LIF gene from a number of mammalian species is highly conserved. The ovine and porcine LIF genes were cloned, sequenced and compared to the previously published murine and human LIF gene sequences. While the coding regions of the LIF gene are highly conserved, the non-coding regions are largely non-conserved. A region of ≈ 340 bp 5' of the translational initiation codon is highly conserved (84%). This region includes four conserved TATA boxes, two transcriptional start-sites identified in the murine gene and the minimal region required to function as the LIF promoter. A sequence in the murine gene adjacent to this highly conserved region, which appears to contain a negative control element, is however poorly conserved between the four species compared, except for a sequence of 16 conserved nucleotides. Within the largely non-conserved first intron, there is a block of ≈ 150 nucleotides which is highly conserved between all four species (≈ 72%). However, a sequence in intron 1 of the murine LIF gene which corresponds to an alternative exon of a putative variant LIF transcript is very poorly conserved between species, with only relics of this exon evident in the other three species. A comparison of the five LIF protein sequences available (murine, rat, human, ovine and porcine) revealed that the protein displays a high degree of similarity, ranging from 74% between mouse and sheep to 92% between rat and mouse. Several large blocks of absolutely conserved amino acid sequence were identified. The ovine LIF gene was modified to allow production of recombinant ovine LIF in yeast cells, and the recombinant protein was shown to be biologically active on murine cells.
Correspondence to T. A. Willson, The Walter and Eliza Hall Institute, Post Office, Royal Melbourne Hospital, Parkville, Victoria 3050, Australia
Abbreviations. LIF, leukaemia inhibitory factor; PCR, polymerase chain reaction.
Leukaemia inhibitory factor (LIF) is a secreted glycoprotein that was initially identified through its ability to induce macrophage differentiation in the murine myeloid leukaemic cell line M1 (Hilton et al., 1988a; Metcalf et al., 1988). However, since its purification (Hilton et al., 1988b) and cloning (Gearing et al., 1987; Gough et al., 1988), it has become apparent that LIF mediates a diverse range of biological actions (for reviews see Gough and Williams, 1989; Hilton and Gough, 1991). Perhaps the most intriguing biological effect attributed to LIF is its action on embryonic stem cells, where it displays an action opposite to that on M1 cells, suppressing the differentiation of embryonic stem cells in vitro (Williams et al., 1988; Smith et al., 1988). Embryonic stem cells maintained in purified recombinant LIF for multiple passages retain their pluripotential phenotype and can give rise to germline chimaeric mice (Pease and Williams, 1990). Furthermore, it has been possible to derive new embryonic stem cell lines de novo in the presence of purified recombinant LIF, without the use of feeder layers (Pease et al., 1990). LIF has also been shown to have a number of other actions. Within the haemopoietic system, it has recently been shown that, in vitro, LIF synergizes with interleukin 3 in stimulating the production of megakaryocyte colonies and, in vivo, increases the number of megakaryocytes, megakaryocyte
progenitors and circulating platelets (Metcalf et al., 1990). LIF also plays a role in bone metabolism (Abe et al., 1986; Metcalf and Gearing, 1989; Reid et al., 1990); in the induction of acute-phase plasma-protein synthesis in hepatocytes (Baumann and Wong, 1989); in affecting the transmitter phenotype in sympathetic neurones (Yamamori et al., 1989); and in inhibiting the activity of lipoprotein lipase in adipocytes (Mori et al., 1989). In these actions, LIF has been equated with the following factors: differentiation inducing factor (DIF), hepatocyte stimulating factor III (HSF III), cholinergic neuronal differentiation factor (CNDF) and melanoma-derived lipoprotein lipase inhibitor (MLPLI). To begin defining structurally and functionally important regions of the LIF gene and protein, a study was initiated to identify regions of conserved sequence in the LIF gene from several species. The potential widespread use of LIF in the generation of chimaeric animals using embryonic stem cells, and in embryo transfer, was a further catalyst for the cloning and sequencing of the ovine and porcine LIF genes. We report here the cloning, sequencing and expression of the ovine and porcine LIF genes. Comparison of these sequences with the human and murine LIF genes showed that the coding regions are highly conserved. While the non-coding regions are largely divergent, a number of highly conserved blocks of sequence are evident, which may represent cis-acting control elements. Alignment of human, murine, rat, ovine and porcine LIF amino acid sequences allowed the identification of several conserved structural features, which will be valuable for relating structural and functional regions within the LIF molecule.
MATERIALS AND METHODS
Library screening
Phage plaques representing Sau3A-partial ovine and porcine genomic libraries in the lambda phage vector EMBL-3 were grown at a density of approximately 50000 plaques/10-cm petri dish, transferred to quadruplicate nitrocellulose filters and pairs of filters hybridized in duplicate with human or murine LIF probes radiolabelled with 32P by nick-translation. The murine LIF probe was the cDNA clone pLIFmut1 (Gearing et al., 1987). The human LIF probe contained only coding-region sequences and was derived from a genomic DNA fragment by deletion of the intron between exons 2 and 3 (Gough et al., 1988). Filters were prehybridized in 6 × NaCl/Cit (NaCl/Cit = 0.15 M NaCl, 0.015 M sodium citrate) containing 0.2% Ficoll, 0.2% polyvinylpyrrolidone, 0.2% bovine serum albumin, 2 mM sodium pyrophosphate, 1 mM ATP and 50 µg/ml Escherichia coli tRNA, and hybridized for 16 h at 65°C in fresh 6 × NaCl/Cit hybridization solution containing 0.1% SDS and probe at 2 × 10⁶ cpm/ml. Filters were washed in 2 × NaCl/Cit, 0.1% SDS at 65°C prior to autoradiography. After purification of positive plaques by rescreening at lower density, genomic DNA inserts were excised with SalI and subcloned into pEMBL8+ (Dente et al., 1983).
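As a quick check on the salt concentrations implied by the NaCl/Cit multiples used for hybridization and washing, the following minimal sketch scales the 1 × definition given in the text (0.15 M NaCl, 0.015 M sodium citrate). The function name `nacl_cit_molarity` and the rounding to six decimals are illustrative assumptions, not part of the original protocol.

```python
# 1x NaCl/Cit as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
ONE_X = {"NaCl": 0.15, "sodium citrate": 0.015}


def nacl_cit_molarity(fold):
    """Return the molar concentration of each salt in a `fold` x NaCl/Cit solution."""
    # Rounding only suppresses floating-point noise such as 0.8999999999999999.
    return {salt: round(conc * fold, 6) for salt, conc in ONE_X.items()}


six_x = nacl_cit_molarity(6)  # prehybridization / hybridization solution
two_x = nacl_cit_molarity(2)  # wash solution

for label, solution in (("6x", six_x), ("2x", two_x)):
    print(label, solution)
```

Under this definition, the 6 × solution corresponds to 0.9 M NaCl and 0.09 M sodium citrate, and the 2 × wash to 0.3 M NaCl and 0.03 M sodium citrate.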
Fig. 1. Detection of the LIF gene in various mammalian species using a murine probe. BamHI-digested genomic DNA from various mammalian species was electrophoresed on a 0.8% agarose gel. The probe was a murine cDNA (pLIFmut1) used under conditions of hybridization as described (see Materials and Methods for details). The positions of the HindIII-digested lambda DNA size markers are shown.
Southern blot analyses
Aliquots of restriction-enzyme-digested genomic DNA (≈ 30 µg) were electrophoresed in 0.8% agarose gels, blotted onto nitrocellulose and prehybridized and hybridized in 2 × NaCl/Cit hybridization solution as described (Gearing et al., 1987). The hybridization probe, a nick-translated murine LIF cDNA fragment [pLIF7.2b (Gearing et al., 1987)], was used at 2 × 10⁷ cpm/ml. Filters were washed in 2 × NaCl/Cit, 0.1% SDS at 65°C prior to autoradiography.
DNA sequencing
Nucleotide sequences were determined on double-stranded plasmid DNA and M13-derived single-stranded DNA by the chain termination method, using the modified T7 DNA polymerase (Sequenase, USB and Promega), and primers external and internal to subcloned fragments. Portions of the ovine LIF gene sequence were determined on a set of double-stranded plasmid DNAs in which a fragment spanning the ovine LIF gene had been resected with exonuclease III using the ExoIII/Mung Bean Deletion Kit (Stratagene). The nucleotide sequence presented was determined on both strands throughout.
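Determining a sequence on both strands implies that a read from one strand, once reverse-complemented, should agree with the read from the other. The sketch below illustrates that consistency check under stated assumptions: the helper names `reverse_complement` and `strands_agree` are hypothetical, the example read is an illustrative fragment rather than verified LIF sequence, and real reads would first need to be aligned and trimmed.

```python
# Translation table for the four unambiguous bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def strands_agree(top_read, bottom_read):
    """True if the bottom-strand read, reverse-complemented, equals the top-strand read."""
    return top_read.upper() == reverse_complement(bottom_read).upper()


# Illustrative reads only (assumption): the bottom read is derived from the top one.
top = "GGATCCCGGCTAAATATAGC"
bottom = reverse_complement(top)

print(strands_agree(top, bottom))
```

The same helper also shows why a BamHI site reads identically on both strands: the reverse complement of GGATCC is GGATCC.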
Modification of the ovine LIF gene and expression in yeast
A 3-kb BamHI fragment spanning the ovine LIF gene was subcloned into plasmid pEMBL8+ (Dente et al., 1983). In order to delete the intron between exons 2 and 3, in vitro mutagenesis was carried out as described (Gough et al., 1988), using the same intron-deleting oligonucleotide as was used to modify the human LIF gene (Gough et al., 1988). Using this oligonucleotide, two nucleotide alterations were introduced into the sheep LIF coding region which did not alter its coding capacity. The 5' and 3' ends of the intron-deleted ovine LIF gene were modified using the polymerase chain reaction (PCR). PCR was carried out on 1 ng target DNA in a 100 µl reaction containing 200 µM of each dNTP, 1 µM of each specific primer [for the 5' end, 5'd(ACGGATCCCCCCTTCCCATCAACCCC) and for the 3' end, 5'd(GCAAGCTTCTAGAAGGCCTGCGCCAA)], buffer as supplied in the GeneAmp kit (Cetus Corp.) and 1.5 U Taq polymerase (Cetus Corp.). Reaction conditions were 2 min at 94°C, 2 min at 60°C, 3 min at 72°C for 35 cycles in a Perkin-Elmer-Cetus DNA thermal cycler. A portion of the PCR product was blunt-end ligated into M13mp9 and the expected sequence of the modified ovine LIF coding region confirmed by nucleotide sequencing. The modified ovine LIF coding region (ovLIFmut1) was then excised from M13mp9 as a BamHI fragment and ligated into the yeast expression vector YEpsec1 (Baldari et al., 1987), which was then used to transform yeast as described (Gough et al., 1988). Ovine LIF was produced by growing ura+ transformants in non-selective yeast medium containing galactose (see Gough et al., 1988).
RESULTS AND DISCUSSION
Detection of ovine and porcine LIF genes by hybridization with a murine probe
In order to look for counterparts of the murine and human LIF gene in other mammalian species (in particular the ovine and porcine homologues) and to gain preliminary structural information, genomic Southern hybridizations were performed. Southern blots of BamHI-digested genomic DNA from a number of different species were probed with a murine cDNA fragment corresponding to the murine LIF coding region. Under conditions of relatively high stringency (hybridization and washing at 65°C in 2 × NaCl/Cit) a unique sequence was detected in all of the eight species examined (Fig. 1). The putative LIF coding region in each species, like the murine and human LIF genes, appears to lie on a unique BamHI fragment. This DNA fragment ranges in size from 6.2 kb in the case of the monkey, to 2.6 kb for the porcine gene, including a 3.8-kb BamHI fragment seen in ovine DNA.
Cloning of the ovine and porcine LIF genes
The ovine and porcine LIF genes were isolated from genomic DNA libraries using murine (Gearing et al., 1987) and human (Gough et al., 1988) LIF coding-region fragments as probes. Several candidate clones that hybridized to both the human and murine probes were isolated from each library. Southern blot analysis of BamHI-digested lambda DNA from each of the clones revealed that all of the sequences that hybridized to the murine LIF cDNA probe were contained within a 2.6-kb BamHI fragment in the porcine clones and a 3-kb BamHI fragment in the ovine clones. This latter finding is in contrast to the Southern blot analysis of ovine genomic DNA (Fig. 1), which indicated that the ovine LIF gene lies on a 3.8-kb BamHI fragment. Southern blot analysis of two different sources of ovine genomic DNA (one of which was the DNA used in library construction) digested with either BamHI or EcoRI revealed no restriction fragment length polymorphisms (Fig. 2). It is likely therefore that the shorter BamHI fragment in the lambda genomic clone reflects the generation of an artifactual BamHI site from a Sau3A genomic site during cloning.
Fig. 2. Southern blot analysis of the ovine LIF gene. Ovine genomic DNA (S) from two different sources was digested with EcoRI (lanes 1 and 2) and BamHI (lanes 3 and 4) and hybridized with a murine LIF cDNA probe as described in Materials and Methods. The right-hand track (M) contains murine genomic DNA.
Structural organisation of the ovine and porcine LIF genes
The nucleotide sequences of the ovine and porcine LIF genes were determined, and are displayed in Figs S1 and S2, respectively. Three blocks of sequence, bounded by consensus splice donor and acceptor sites, which are highly similar to the three exons in the human and murine LIF genes were identified. Exon 1 specifies the 5' untranslated region and the first six amino acids of the hydrophobic leader; exon 2 encodes the remainder of the leader sequence plus the first 53 amino acids of the mature protein; and exon 3 codes for the C-terminal 137 residues and the 3' untranslated region. Across all four species, nucleotide sequence identity in the three exons is 71%. Indeed, nucleotide sequence conservation is largely restricted to the coding regions of the LIF gene. This is in contrast to the functionally related interleukin 6 gene, which displays its highest degree of interspecies similarity in the non-coding regions (Tanabe et al., 1988).
Analysis of the promoter region of the LIF gene
Comparison of the murine and human LIF genes previously revealed a number of highly conserved sequences occurring in the non-coding regions (Stahl et al., 1990). Several of these are also found in the ovine and porcine LIF genes. Alignment of the sequence at the 5' end of the murine, human, ovine and, to a lesser extent, the porcine LIF genes revealed an 84% sequence similarity over a region of approximately 340 bp 5' of the translational initiation codon (Fig. S3). All four of the TATA-like elements previously identified in the mouse and human genes (Stahl et al., 1990) are conserved, yet only the second and third TATA boxes have been shown to be functional in the murine LIF gene (Stahl et al., 1990). In addition, the major and minor transcriptional start sites (Stahl et al., 1990) are surrounded by large blocks of conserved sequence. Indeed, the region defined as the essential promoter region (-103 to +1; Stahl, J. and Gough, N. M., unpublished results) is totally conserved between the murine, human, ovine and porcine LIF genes, except for an insertion just downstream of the third TATA box and a single nucleotide difference near the first TATA box. This region includes the sequence elements between -103 and -81, which have been shown to be critical for the functioning of the LIF promoter (Stahl, J. and Gough, N. M., unpublished results). Interestingly, apart from a few simple base substitutions, sequence variation between species in the LIF promoter region appears to be the result of small insertions or deletions. Sequence divergence resulting from deletions in non-coding sequences has been well documented (Efstratiadis et al., 1980). The 'slipped-mispairing' model proposes that deletions occur by inappropriate pairing of short direct repeats (2-8 bp) during DNA replication (Efstratiadis et al., 1980). This model predicts that small direct repeats are located near the endpoints of a deletion. Such short direct repeats are in fact found near the ends of each of the deletions in the LIF 5' region (Fig. S3).
It is also noteworthy that the murine, human and ovine promoter regions all harbour consensus AP2 nuclear-factor-binding sequences (CCTCCC) (Johnson et al., 1988; Eaton et al., 1987), but only the murine sequence contains an AP-1-like binding domain (GGGCGG) (Dynan and Tjian, 1983). A candidate negative control element has been identified in the 5' region of the murine LIF gene between -360 and -249 (Stahl, J. and Gough, N. M., unpublished results). Surprisingly, the DNA sequence between positions -360 and -249 is poorly conserved between the murine, human and ovine LIF genes, although the nucleotides between positions -272 and -257 are almost identical, as shown in Fig. S3. Whether this region of 16 nucleotides constitutes the putative negative control element remains to be elucidated.
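Short consensus motifs such as the CCTCCC and GGGCGG sequences discussed for the promoter region can be located with an exact string scan. The sketch below is illustrative only: the function name `find_motif` and the toy sequence are hypothetical (the toy sequence is not LIF sequence), and the scan covers exact matches on the given strand, whereas a real analysis would also consider the opposite strand and degenerate matches.

```python
import re


def find_motif(seq, motif):
    """Return 1-based start positions of every (possibly overlapping) exact match."""
    # A zero-width lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return [match.start() + 1 for match in pattern.finditer(seq.upper())]


# Hypothetical toy sequence used purely for illustration (assumption).
toy_promoter = "AACCTCCCTTGGGCGGAACCTCCC"

ccctcc_hits = find_motif(toy_promoter, "CCTCCC")   # motif quoted in the text
gggcgg_hits = find_motif(toy_promoter, "GGGCGG")   # motif quoted in the text

print("CCTCCC at", ccctcc_hits)
print("GGGCGG at", gggcgg_hits)
```

Applied to aligned promoter sequences from several species, the same scan would show which motifs occur at corresponding positions and which are species-specific.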
Analysis of the first intron of the LIF gene
It was of particular interest to analyse the DNA sequence of the first intron of the LIF gene from other species. Comparison of the murine and human LIF genes had previously shown a highly conserved region of DNA sequence, embedded in the largely non-conserved non-coding region of the gene (Stahl et al., 1990). Conservation of this region of sequence may be accidental, or may be of functional significance. In the latter case, one would predict that this region would also be found in the LIF gene of other species. In fact, alignment of the nucleotide sequence of the relevant region of the murine, human, ovine and porcine LIF genes (Fig. S4) reveals a block of some 150 nucleotides which is 72% conserved between all four species. Such strong interspecies similarity makes this region a strong candidate for a regulatory element for LIF expression. Indeed the interleukin 6 and GM-CSF genes (Tanabe et al., 1988; Miyatake et al., 1985) also have highly conserved sequences in the introns, which have been taken to represent cis-acting control elements. The actual role played by this putative element in the function of the LIF gene has yet to be elucidated, but its identification allows specific experiments to be designed.
A DNA sequence (41 bp) that corresponds to the first exon of a variant LIF transcript (Rathjen et al., 1990) is located within the first intron of the murine LIF gene. The DNA sequence of exon 1b of this variant LIF cDNA clone is identical to nucleotides 1442-1482 in the first intron of the murine LIF gene sequence (Fig. S5) (Stahl et al., 1990). Intriguingly however, this sequence is very poorly conserved in the human, ovine or porcine LIF genes. An obvious vestige of this exon is found in comparable regions of the first intron of the human and porcine LIF genes (Fig. S5). However, these sequences could not encode the same amino acid sequence as in the variant murine exon 1b and lack initiation codons. Moreover, only the porcine sequence contains a consensus splice donor sequence. The situation in the ovine gene is even more intriguing since an obvious homologue of this exon could not be found. The DNA segment with closest sequence homology was found considerably further downstream in intron 1 in the ovine gene and exhibits a low level of similarity (Fig. S5). Once again this sequence lacks an initiation codon and a consensus splice donor sequence. These findings perhaps call into question the significance of the variant murine transcript, which has been suggested to encode a matrix-associated form of LIF. It is conceivable that this variant exon is generated in the murine gene by adventitious transcription of the intron across a region which contains a fortuitous splice donor site.
Comparison of LIF proteins
Table 1. Cross-species homology of LIF proteins. The percent identity of amino acid sequences for the mature LIF proteins from five species is given in pair-wise fashion.
Species    Murine   Rat   Human   Ovine   Porcine
                     (% identity)
Murine      100
Rat          92     100
Human        79      81    100
Ovine        74      75     88     100
Porcine      78      78     87      84     100
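The pair-wise percentages of the kind listed in Table 1 can be obtained by counting identical positions in pre-aligned sequences of equal length. The following is a minimal sketch under stated assumptions: the function names `percent_identity` and `percent_of` are hypothetical, the two ten-residue sequences are toy data rather than LIF sequence, and gaps are not handled. The second calculation uses the counts reported in this section (118 totally conserved and 14 conservatively substituted residues out of 180).

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def percent_of(count, total):
    """Express `count` out of `total` as a percentage."""
    return 100.0 * count / total


# Toy sequences for illustration only (assumption): 9 of 10 positions match.
toy_a = "ACDEFGHIKL"
toy_b = "ACDEFGHIKV"
identity = percent_identity(toy_a, toy_b)

# Identical plus conservatively substituted residues across the five species.
similarity = percent_of(118 + 14, 180)

print(f"toy identity: {identity:.1f}%")
print(f"overall similarity: {similarity:.1f}%")
```

Running `percent_identity` over every pair of aligned mature protein sequences would fill a lower-triangular matrix laid out like Table 1.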
The amino acid sequences of the ovine and porcine LIF proteins, deduced from the nucleotide sequences of the corresponding genes (Figs S1 and S2), show no insertions or deletions in the mature protein, which further highlights the highly conserved nature of this molecule. In addition, the sequence of rat LIF has been reported (Yamamori et al., 1989) and predictably is also highly similar to all the other LIF molecules. Inspection of the LIF protein sequence alignment reveals that roughly the N-terminal half of the mature protein shows the highest conservation (Fig. S6), thus implicating this region as a key structural or functional domain. The amino acid sequence around the regions corresponding to splice sites in the mRNA is also highly conserved. Table 1 contains a cross-species comparison of LIF protein sequences based on percentage identity. Such a comparison reveals that the ovine and porcine LIF proteins are more closely related to human LIF than murine LIF, having 93% and 85% amino acid identity, respectively, with human, and 72% and 76% identity, respectively, with mouse. Similarly, amino acid sequence comparison of the bovine, human and murine GM-CSF proteins revealed that bovine GM-CSF is more closely related to human GM-CSF (71% similar) than murine GM-CSF (56% similar) (Leong et al., 1989). Analysis of amino acid sequence identity across all five species of LIF shows that 65% (118/180) of amino acid residues are totally conserved and 8% (14/180) of amino acid residues have undergone conservative substitutions. Thus there is a high degree of similarity (73%) between the LIF molecules from these five species.
Alignment of the human, murine, rat, ovine and porcine amino acid sequences (Fig. S6) reveals some striking regions of amino acid conservation between the different LIF proteins. The hydrophobic leader sequences are highly conserved, showing 83% amino acid identity across all five species. Interestingly however, murine LIF contains an insertion of a leucine residue (at position 10) in the hydrophobic leader with respect to the other four sequences, including rat. Sequence analysis of a number of independent murine cDNA and genomic clones has confirmed the authenticity of this insertion. Thus it would appear that this represents a recent evolutionary event, since it is lacking in the rat. All of the six cysteine residues found in the mature murine LIF protein are conserved in all five species and, moreover, their positions are identical in each protein. This suggests that intramolecular disulphide bridges play a vital structural role in the integrity and activity of the LIF molecule. Interestingly, rat LIF has an extra cysteine at residue 146. Six out of seven potential N-linked glycosylation sites are absolutely maintained, in identical positions (the seventh site is conserved in all species except pig), suggesting that glycosylation of the molecule is important. Intriguingly, although native LIF is indeed heavily glycosylated (Hilton et al., 1988a), glycosylation is not actually required for biological activity, since E. coli-derived recombinant human and murine LIF are highly active (Gearing et al., 1989). Finally, the disposition of charged residues is quite similar in the five sequences. These results collectively suggest that the LIF molecules from these five different mammalian species share many similar structural elements.
Expression of ovine LIF in yeast
Production of biologically active recombinant murine and human LIF in yeast has been described previously (Gearing et al., 1987; Gough et al., 1988) using the expression vector YEpsec1 (Baldari et al., 1987). In order to express ovine LIF in yeast using the YEpsec1 vector, it was necessary to make a number of modifications to the cloned gene. First, the intron between exons 2 and 3 was deleted by oligonucleotide-directed mutagenesis, thereby fusing the two exons in-frame. Secondly, to allow in-frame insertion with the Kluyveromyces lactis leader sequence of YEpsec1, a BamHI restriction site was introduced at the presumed start site of the sequence encoding the mature protein. This was accomplished using the PCR procedure, which also inserted a BamHI site immediately after the stop codon. The resulting PCR product was then blunt-end cloned into M13mp9 and the expected sequence confirmed. The modified ovine LIF gene (ovLIFmut1) was then excised from M13mp9 as a BamHI fragment and inserted into YEpsec1. After introducing the YEpsec1/ovLIF recombinant plasmid into yeast, transformed yeast clones were selected and grown in galactose-containing medium to induce the GAL-CYC promoter in YEpsec1. The growth medium from such cultures was then assayed for LIF activity, using the differentiation of M1 cells as a bioassay (Metcalf et al., 1988). Recombinant ovine LIF produced in such cultures was active in inducing differentiation in murine M1 colonies, with titration curves typical of those for recombinant murine LIF. Two independent yeast clones transformed with a YEpsec1 construct with the ovine LIF coding region in the correct orientation produced LIF to levels of 1.6 × 10⁴ and 8.3 × 10⁴ U/ml, while a control clone transformed with a construct with the ovine LIF coding region in the incorrect orientation produced no detectable LIF. Thus, like human LIF, with which it bears significant sequence similarity, ovine LIF exhibits cross-species activity on murine cells, further underscoring the conserved nature of the molecule.
Concluding remarks
The studies reported in this paper, taken together with data published elsewhere, collectively reveal that the LIF protein is a highly conserved molecule, one of the most highly conserved of the haemopoietic cytokines, suggesting it to be of fundamental importance. Indeed, the conservation of the LIF gene sequence is further highlighted by the ability to detect a LIF gene homologue in various marsupial species by cross-hybridization to the murine LIF cDNA (Spencer, J., Watson, J. and Gough, N. M., unpublished results). Although originally purified and cloned on the basis of its differentiation-inducing activity on M1 myeloid leukaemic cells, LIF has now been shown to have practical applications in other areas. The action of LIF on embryonic stem cells, which represent an important route for the generation of transgenic animals (Capecchi, 1989), suggests that LIF may play a vital role in the establishment of transgenic animals and in gene-targeting experiments (Capecchi, 1989). Such procedures may well extend beyond murine experiments to include commercially important livestock species, such as cows, sheep and pigs. In addition, LIF appears to act in vitro on preimplantation embryos, to increase the development of such embryos and to increase their implantation rate (Robertson et al., 1990). Thus LIF may well become an important agent in various embryo manipulation procedures with a range of animal species. In this regard the provision of LIF from agriculturally relevant animal species will be invaluable. The studies reported in this manuscript have revealed a number of conserved regions of sequence within the LIF gene and protein. Although the functional relevance of these regions has yet to be fully defined, their identification will allow specific experiments to be designed and undertaken to directly determine their significance.
We are grateful to Anna Raicevic for technical assistance, and Doug Hilton and Nic Nicola for their interest and discussions. This work was supported by AMRAD Corporation (Melbourne), The National Health and Medical Research Council (Canberra), National Institutes of Health (Bethesda) Grant CA22556 and the Anti-Cancer Council of Victoria.
REFERENCES
Abe, E., Tanaka, H., Ishimi, Y., Miyama, C., Hayashi, T., Nagasawa, H., Tomida, M., Yamaguchi, Y., Hozumi, M. & Suda, T. (1986) Proc. Natl Acad. Sci. USA 83, 5958-5962.
Baldari, C., Murray, J. A. H., Ghiara, P., Cesareni, G. & Galeotti, C. L. (1987) EMBO J. 6, 229-234.
Baumann, H. & Wong, G. G. (1989) J. Immunol. 143, 1163-1167.
Capecchi, M. R. (1989) Science 244, 1288-1292.
Dente, L., Cesareni, G. & Cortese, R. (1983) Nucleic Acids Res. 11, 1645-1655.
Dynan, W. S. & Tjian, R. (1983) Cell 35, 79-86.
Eaton, S. & Calame, K. (1987) Proc. Natl Acad. Sci. USA 84, 7634-7638.
Efstratiadis, A., Posakony, J. W., Maniatis, T., Law, R. M., O'Connell, C., Spritz, R. A., DeRiel, J. K., Forget, B. G., Weissman, S. M., Slightom, J. L., Blechl, A. E., Smithies, O., Baralle, F. E., Shoulders, C. C. & Proudfoot, N. J. (1980) Cell 21, 653-668.
Gearing, D. P., Gough, N. M., King, J. A., Hilton, D. J., Nicola, N. A., Simpson, R. J., Nice, E. C., Kelso, A. & Metcalf, D. (1987) EMBO J. 6, 3995-4002.
Gearing, D. P., Nicola, N. A., Metcalf, D., Foote, S., Willson, T. A., Gough, N. M. & Williams, R. L. (1989) Biotechnology 7, 1157-1161.
Gough, N. M., Gearing, D. P., King, J. A., Willson, T. A., Hilton, D. J., Nicola, N. A. & Metcalf, D. (1988) Proc. Natl Acad. Sci. USA 85, 2623-2627.
Gough, N. M. & Williams, R. L. (1989) Cancer Cells 1, 77-80.
Hilton, D. J., Nicola, N. A., Gough, N. M. & Metcalf, D. (1988a) J. Biol. Chem. 263, 9238-9243.
Hilton, D. J., Nicola, N. A. & Metcalf, D. (1988b) Anal. Biochem. 173, 359-367.
Hilton, D. J. & Gough, N. M. (1991) J. Cell. Biochem., in the press.
Johnson, A. C., Jinno, Y. & Merlino, G. T. (1988) Mol. Cell. Biol. 8, 4147-4184.
Leong, S. R., Flaggs, G. M., Lawman, M. J. P. & Gray, P. W. (1989) Vet. Immun. Immunopath. 21, 261-278.
Metcalf, D., Hilton, D. J. & Nicola, N. A. (1988) Leukemia 2, 216-221.
Metcalf, D. & Gearing, D. P. (1989) Proc. Natl Acad. Sci. USA 86, 5948-5952.
Metcalf, D., Nicola, N. A. & Gearing, D. P. (1990) Blood 76, 50-56.
Miyatake, S., Otsuka, T., Yokota, T., Lee, F. & Arai, K. (1985) EMBO J. 4, 2561-2568.
Mori, M., Yamaguchi, K. & Abe, K. (1989) Biochem. Biophys. Res. Commun. 160, 1085-1092.
Pease, S. & Williams, R. L. (1990) Exp. Cell. Res. 190, 209-211.
Pease, S., Braghetta, P., Gearing, D., Grail, D. & Williams, R. L. (1990) Dev. Biol. 141, 344-352.
Rathjen, D. P., Toth, S., Willis, A., Heath, J. K. & Smith, A. G. (1990) Cell 62, 1105-1114.
Reid, I. R., Lowe, C., Cornish, J., Skinner, S. J. M., Hilton, D. J., Willson, T. A., Gearing, D. P. & Martin, T. J. (1990) Endocrinology 126, 1416-1420.
Robertson, S. A., Lavranos, T. C. & Seamark, R. F. (1990) in Molecular and cellular immunobiology of the maternal-foetal interface (Wegmann, T. G., Nisbett-Brown, E. & Gill, P. J., eds) pp. 191-206, Oxford University Press, New York.
Smith, A. G., Heath, J. K., Donaldson, D. D., Wong, G. G., Moreau, J., Stahl, M. & Rogers, D. (1988) Nature 336, 688-690.
Stahl, J., Gearing, D. P., Willson, T. A., Brown, M. A., King, J. A. & Gough, N. M. (1990) J. Biol. Chem. 265, 8833-8841.
Tanabe, O., Akira, S., Kamiya, T., Wong, G. G., Hirano, T. & Kishimoto, T. (1988) J. Immunol. 141, 3875-3881.
Williams, R. L., Hilton, D. J., Pease, S., Willson, T. A., Stewart, C. L., Gearing, D. P., Wagner, E. F., Metcalf, D., Nicola, N. A. & Gough, N. M. (1988) Nature 336, 684-687.
Yamamori, T., Fukada, K., Aebersold, R., Fann, M. J. & Patterson, P. (1989) Science 246, 1412-1416.
Note added in proof. Examination of the nucleotide sequence of intron 1 of the rat LIF gene also shows very poor conservation of the putative first exon of the variant LIF transcript. In particular, there is no initiation codon in the analogous reading frame of the rat homologue of exon 1b (Baffet, G., Fletcher, R., Cui, M.-Z., Northemann, W. and Fey, G., personal communication).
Supplementary material to: Cross-species comparison of the sequence of the leukaemia inhibitory factor gene and its protein
Tracy A. WILLSON, Donald METCALF and Nicholas M. GOUGH
CTCGGCCTCAGCTGCCTGTAGTCTCTGACCACCTCCCCTAACTCAGGTGAGTCGGGGCCGTCTGGGGTGTCCACCATGTCTGGGCCTGGGGGATTGGAGG AACCCCCCGCTGACCCAGCCTGGGGGACCCCGCCCACTCTGGGCATCATCAAGTGAGTGCCAGGCAAGCCACTGGGGAAAACCACACAGAGCGTATGTTGAA
100
CTTCCATT~T~TTTCCTATGATGCACCTCAAACAACTTCCTGGACTGGGGATCCCGGCTAAATATAGCTGTTTCTCTGTCTTACAACACAGGCTCCAG
200 300 400
Met TATATAAATCCGGCAIVLTTCCCCATTTGAGCATGAACCTACGGCCGGCATCTGAGGTCTCGTCCAAGGTCCTCTGGAGCACACAGCCCATGATC
500
GACTTCATTATMTTTTATCAAATTCTTCTTAGGGARGTCTGCTCTCCCCTCCCTCCCCCCTCTCTCGTCCCCCACCCCCCCACTCTCACTTTT
LysValLeuAlaAl8 AAGATCTTGCCCOCAGGTATGCGCCCGCCCCGCGCCGGCCGCGGCGGCCACTTGGGGCCCTGGGCAGCGCAGACTCCCGGACGCTCGGGGCGCCCGCC 600 GGGGG-TGGGTGGCGGCTGGGAGCCAGTGAGCGCAGGCACATGGTTCGGGCACCGTGCGCCCAGCTCGTCCCGGGCCAGCAGGTTGGCAGCAGGGAGGGG 700 GCCACCGCAGCCTTCCTCCTCCTCCTCTTCCTCCTCCTCCTCCTCCTCCTAGCTCCCGCCGCTCGCCCCGCTGCTCTTCCTTCTCTGCTCGCTCCCGGAT 800 TCACCCACCCTCTTTATTTTTCTTTCTTTCTTTCCGCTTTCTCTTTTCCAACCGCGTCGGGCTGCTCCCGGGGAGGGGCGCTGGCGGCCGAGCAGCCGAA 900 CAGCTCGCAAACTCCTGCCCGGCCCGCGGGGAGCAGGTGCCGCCTCCATCTGCCCGGTGCCTGAGCTAGGTGAGATGGGGGTGTCGGAGGAGGGGGGACC 1 0 0 0 CTCTCCCCTCCCCAGCCGTCTCCCGGCCCCAGCACTGCCTGCAGATGCTGGGAGC~CCGGGGAGCCCAAGAGAAGTGAACTTGCGGGGCTCCACCATTC1 1 0 0 GTCGCGC-GCTGCAGCCGGGATGGGGGCTCTGGGGTGGCTGACCCAGCCCCGACCACCCCCAGGTGCCCACTGCTGGGCCGTGGAAGGGGCCGTGGTCTGA 1200 GGGTGCAGGGGCGGGCGGGGAGGGGAGGGCCGCTCCTGGACAAGCCCGGACAGCTCCCAGCTCCGAGCACCTTTTCCCTGCCCTCCCTCCCCCCACTGGG 1 3 0 0
CTGCCTGCGACCTTTTCCCTCTTCTCTTTCCTGTTTGCACCATTTCCTCTCCCTCCCAATGCCGCCTGGGGGCAGGTTGGGTATGTGTGTGTGTGTGCTA 1 4 0 0 GTGTGTGGGACCCCCCCCCCCATTCTCCCTGCCCTAACCCAAGGGACTGGACAAGGGACAGTAGAGGAGC~GAGTTCGCTAGGGCCTGGACCAGGA1 5 0 0 TTTGGGGCTGGAGGGTGAGGAACCTTGGAGAGGGGTGTGGCCCCACTGG--AAGGACAGAGAGGGAGGAGGGAGGGAGGGTGGGCAGAATTAGGGGACTA 1 6 0 0 TCGGCCCAAGGAGTCAGAATCAAGCAGCAGCGGGCTTGGGAGGGAGAAGCAGGGAGACCCTAGGAGCGCCCTCCCCCCTTCCCCAGCGTGCCCCATTTGGATGAGG 1700 TGCACTGCCTGGGGCTGGTGATGGGACAG-AAGCTCGGGAATCGGGAATCCCAGGCATCCTGGGAGGGGGGAGA~~GGCCTGGGAGGGGTTGATGCCTTCCAGCCA 1800 GAGGCCAGGGGCTTTGACCTGTCTGGGGAAGGGGGTAGGGGTTGGATGTCTCCCTGCCCCTTGGCTCCTCCTCTTAACCCCAGAGACTGCAGAAGGAGAG 1900 GGGAAAGTGGAGAGAGAGAGAGAGAGAGAGAAGAGAGGCAGATGCTGAGACRAAGTGAAAACCCCACCCTGCCCAGGGGAAGACAGAGGGGGGTCAGGGA 2 0 0 0 ClyValV81ProLeuLeuCe G T C C C T C C C A C T G G C A T C C A G T G T G A C C C C C A A G C A C C C G T C C C A C C T C T G C G C T C A C G G C T C C T C C C T G C C T C T C C C C A G ~ ~ ~ T G ~2C1C0 ~ 0 ~~T
Fig. S1. Nucleotide sequence of the ovine LIF gene. The arrangement of the exons within the gene was determined by comparison to the murine and human LIF gene sequences (Stahl et al., 1990). The coding regions of the three exons and the four TATA-like elements in the 5’ region of the translational start codon are in bold type. BamHI cloning sites are underlined.
27 GGATCCCTGCTAAATATAGCTGTTTCTCTCTCTGTCTTACAACACAGGCTCCAGTATATAAATCAGGCAAATTCCCCATTTGAGCATGAACCTCTGAAAA 100 M e t LysValLeuAlaAla CTGCCGGCATCTAAGGTCTCCTCCAAGGCCCTCTGGAGTTCAGCCCAT~T~CCTCTCTTGGCG~GTARATCCATCCGCCCCGCGCCGGCTTCCGCG2 0 0 CCCCGCTACGGGCCCAGCGGCAACGTGGGGCACTTGGGGCACTTGGCGATCGCGAGCGGGACACCCACCCGCTGCAGACACACGGACACTTGGGGCGCCCGCGCAGATG 3 0 0 CCAGGCAGCCGGAGGCTCTGGGTGGCTGCCGGGAGCGAGCGCAGGCACATGGTCCCCGCACCGCGCGCCCAGCCCGTCCGGGGCCCGCAAGTTGGCGGGA 400 CAGGGAGGGAGTGCTGCAGCCTTCCTCCTCCTCCTCCTCCTCCTTCTCCTCCTCCTCCTCCACGATCTTGCGCTCTCCCGACTACTTTTCCCGCTCTGCC 500 TTCTCCGGGATTCACCCTCCCTCTTTCTTTCTTTCTGTCTTTCCGCTTTCTTTTCC~CCGCGTCCCGGGCTGCTCCCTGGGAGGGGCGCCGGCCGCTGA 600 GCAGCTTGCAAACTCCGGCCCGGGACGGAGCAGGTGCCACCTCCATCTGCTAGTGCCTGGARAGCTGTGATCTGTGCTAGGTGAGCCCGGGGTGTGGGGG 700 GCCCCCCAGCCATCCTGGGGCCTCAGCACTGCCTGGGGATGCTGGGAGGCACAGGGGACCCAGAAGTGAACTTCAGCTGCGCACAGGGAATCAGCTTCCC 800 CACGCATGCCTCCTCCTTGCCCAGCCTCACCAGTTATCCAGCCTCGCAGGTCCGCTGACCCCAGCCCTTGCTGCTGGGTCACCTGCAGCTAGGGTTTT 900 GGGGCGCGGGTAGGAAGCAGGCGCCCTGAGGGAGGGGGAGGGGCGATCCTGGACAGGCCGGGACAGCTCCCAGCTCTCCTGTCACCTTTCACTTTCCTTC 1000 CTCCCCCGCCCACCTGGCAGCATGCGACCTTTTCCCTTTTTTTCTTTCCTGGTTGCACCATTTCCTTCCCCTCCTGAAGGCTCTGGGGGCCTGTGGGACC 1100 CCAGGATTCTCCTGGTCCARACGCTCCACTTGGTACAAGGACTAATAGGCAAGAGTGAGAGGGCATGGAGGCCAGGATCCCCACAGAGCTG--GGACAGG 1 2 0 0 AAGTTTTGAGGGATTATTTGGTCRAGGAGAGGAGAGGGGAATTGCAGAATTGGGGTGGAGTTGAGGGTCCTAGATAGGGGAGTGAGCGGCCACAGTGGGGGAAGG 1300 AGAAACACAGAGAAGATGGGCAAAGGAGAGATGGAliTTAGGGGACGAATGGTCAGTGAGCCARAATCATGTGCTAGGCTTGGGAGATGAGGGTGAGCAGAAC 1400 CCCCTCCCCCCTTCCCCATGGTCCCAGCTCGCTCTGGGAGACTCAGGTATGAAGTAACACTTAAGAATTTGGACCATCCTGGGAGGTGGGGAGATAGGGT 1500 CAGCTCTTCTCTCTGTGACCTGCTGGGCTTGGGGTGTGGTGCCCTCCAGCC~GGCCAGGATGGTTGTTGTCACCCTGGACCCAGCTTCCTACTGGTAC 1600 AAATAGCTCCTTTCCTGCCCCTTGCTACCAGAGGTAGAGAGTTAGCCATCTGGAGAGAGGGAGATAGAGGAGGAGAGGTGGTGACATCGTCAGGGCTGGG 1 7 0 0 GAAGAGAGAGCTGGAGGCTGAGAGGGGACACTGGTGTCCCTAAAGGAGACGAGGCGCATCCTATCCTGGGAGCCCCCTTGCCTACCCTTTCACCTCCCT~ 1800 
GlyValValProLeuLeuLeuValLeuHisTrpLysHisGlyA1aGlySerProLeuSerIleThrProVa1A A T C A T G G C T C G T T T T T G C C T G T T T G C A G G A G T T G T T G T G C C C C T G C T G C T G G T T C T G ~ C T G G ~ C A C G G G G ~ ~ ~ G C C C C C T T T C ~ T ~ C1900 TCCTGT~
s~aThrCy.AlaThrArgHisProCysKisSerA~n~~et~nGlnIleLy~nGlnLe~laHisVal~nSerSerAl~~laLe~heIl ATGCCACCTGTGCCACACGTCACCCATCTCACACAG~CCT~T~CCAGAT~CCAGCT~CG~CGT~~~GTGCC~CGCCC 2 0T0C 0TTTAT
eLeuTyr TCTCTACGTAAGTTCACCCCCCTTGCCCGACCCCAGGGTGCTGAGGAAGGAAGGAGGGAGGGAGGGGCTGGGGTTTGCTGTGGGACCTGGGCTGGCGGGC TGGCTAGGGAAGGGAAGGGAGGGGGGCTCGGTCCTACCATGTGCAGAGTCCCACAGCTTCCGCCCCACTCCCCACCCTGGCCAAGGCTCGGTCACTTTCC GAGGAGAGAAGGAGGATGATGTAGGAGGACGGGGAGGTACTGTCGTCAGGGGAGACAGTGGGCTGTGGGTGTCTGAGGTGCAGACTGCCCAGGAAGAGGA TGAGTGCGGTGGGTGAAAGGGCAAGTGTGTGTGTGGTGCGCGCGTTGGAGGTGAGACAGTGTTTTGCTTTCTGTCTGCATGGGGTCCTGGTGGTGGTGATGA GGAGGATGCTGATCCTCCGCACGGCCCGGGTGTGCGTTCACACCCACAGTTACTTCTGGTTCTCAGGACGGTCCCRAGGAGAGCTGCCTTGGGGCAATTGAGTGG GTCATCGTGCCAGTTTGGGGGCAGTGGGGACCTGAGGCCAAGACCTGACCCAGAGGCTTGGAGGCAGCGCGGGAAGCCCTGTCCCTGACTCCATGTCACC
7100 2200 2400 2500 2600 2700
TyrThrAlrGlnGlyGluPrPheP~~sn~~spLy~Leu~~GlyPr~nValT~~~heProP~ T C C C C T C T G T G C C T C T G C C C C T C A G T A C A C A G C C C A C G C C C T T C C C G C C C
2800
PheHisAlaA~nGlyThrG1uLy~AlaRrg~uValGluLeu~rArgIleIl~aTyrLeuGlyAlaSerLeuClyAsnIleThrAr~pGlnArgS T T C C A C G C C A A C G C C A C C G A G T T G A A C G C C C C C C T C C T C C A C C 2900
erLeuA~nProGlyAl~ValAsnLeuHisSerLysLeuAPnAlaThrAlaAspSer~etArgGlyLeuLeuSer~nValLeuCysArgLeuCy~AsnLy C C C T C A A T C C T G G T G C T G T G C T G C A C T C ~ ~ ~ ~ ~ A C G G C G G A G T T ~ G ~ T G C ~ G G ~ T C ~ ~ ~ T G T G ~ 3C0 0T0~ C G C C T G T G ~ C ~ sTyrHisValAlaHisVa1AspVa1A1aTy~lyProAspThrSerGlyLysAspValPheGlnLyeLysLysLysLeuGlyCysGl~eu~uGlyLysTyr G T A C C A C G T G G C C C A T G T C G T ~ C C T A C G ~ C C C 3100 A C
Ly6GlnValIleSerV8~LeuAlaRrgAlaPhef** AACCAGGTCATCTCTGTGCT~CCCGCGCCTTCTTCT~TGGAAGGTCCCCTAGCACCCCGTGACCTGAGGTCTTAGACTTAGGTGACTCTCARACTGTGCCG GGGCCCAGAACATCACCAGACCCAAGTGGGGGTTGCTGACAGACCCGGGAGGGAGGGGGGCAGTTCTTAGCTGTCTCCTCTCCTCAGGGGTGGGCTGTGA CCATCACCACCTTGTTCCCTCAGTCAGAGTCTTCATGATCACATCACCCAAGTCATCTGCAGTGACCCTGACCATGGGGTGAGACAGCAGGAGTTGGAGG CATGTCCCAGCCCCCAGCAGAAGGACCACCACCTTCAGTGCCTTTGCTGCCCCTTAGGTAGACTTTGAAAGGTCTGGTTGGAACTCAGGCAGCGCAGAGGGGC TGGGATCC
3200 3300 3500 3600
3608
Fig. S2. Nucleotide sequence of the porcine LIF gene. The arrangement of the exons within the gene was determined by comparison to the murine and human LIF gene sequences (Stahl et al., 1990). The coding regions of the three exons and the two TATA-like elements in the 5’ region of the translational start codon are in bold type. BamHI cloning sites are underlined.
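As a minimal sketch of how the underlined BamHI cloning sites could be located computationally, the following Python snippet scans a sequence for the BamHI recognition sequence GGATCC and reports 1-based start positions. The function name `find_bamhi_sites` is hypothetical, and the two fragments are short excerpts copied from the beginning and end of the porcine sequence above, so the reported positions are relative to each fragment rather than to the full gene.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def find_bamhi_sites(sequence: str) -> list[int]:
    """Return the 1-based start position of every GGATCC occurrence."""
    seq = sequence.upper()
    positions = []
    start = seq.find(BAMHI_SITE)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = seq.find(BAMHI_SITE, start + 1)
    return positions


# Short excerpts from the first and last lines of the porcine sequence (Fig. S2).
five_prime_fragment = "GGATCCCTGCTAAATATAGC"
three_prime_fragment = "TGGGATCC"

print(find_bamhi_sites(five_prime_fragment))
print(find_bamhi_sites(three_prime_fragment))
```

Applied to a complete, cleanly transcribed sequence, the same scan would give positions in the numbering used in Figs S1 and S2.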
M CAGGCTGAGGACCCCTCTGAATCCCCCTGAGACCCTTTCCCCACCAGACCCATCTGCAAAAAACGCCCAGAGCARACCACT
* *
****
*
*
*
*
*
*** ***
-275
H CACTGCTGGGACCCCTGCTGACTCGGCCAGGGGCCCCCTCCTGGCGATGCCATCTTCAGACAACTCCCGGGACAAGCCAGG
****
*
*
*
*
*** ***
*
S ATTGGAGGAACCCCCCGCTGACCCAGCCTGGGGGACCCCGCCCACTCTGGGCATCATCAAGTGAGTGCCAGGCAAGCCACT -272
-257
M TAGGllAAACCACAGGGCGGTTTTTTGTTGTTGTTGAAGACTTCATTATMTTTTATCAATCARATTCTTAGAAGAAGGAAA
************ ***
** **** ***************
* *
.......................
-194
...TTTT. .....GTCGAAGGCTTCATTATAATTTTATCAATCAAATTCAATCAAATATTCTTAGAAGAGGGAAA * * ** **** *************** ....................... GGGGAAAACCACAGAGCG. ...TAT. .....GTTGAATCAAATGACTTCATTATMTTTT....ATCAAATTCTTAGAATCAAATGAGGGAAA
H CAGGAAAACCACAGGGCG **A*********
S
*t*
M AA.GTCTGCCCTCCCCACCCTCCCCCCTCACTCTTCCCCCCTCCCCCTTCACTCTCAC.TTTCTTCCATTCATM~TCCT -115
** *****
****** ************ *** * * * * * *
**
*********
*************I********
H A.GTCTGTTCTCCCCACCCTCCCCCCTCACTCGTCCCCCC....CCTTCACTCTCAC.TTTCTTCCATTCATMTTTCCT
** *****
* * * * * * ************ * * * * * * * * *
*.
********* . . . . . . . . . . . . . . . . . . . . . .
S AAAGTCTGCTCTCCCCTCCCTCCCCCCTCTCTCGTCCCCCACCCCCC..CACTCTCACTTTTCTTCCATTCATM~TCCT
M ATGATGCACCTCAAACAACTTCCTGGACTGGGGATCCCGGCT~TATAGCTGTTTCTCTCT..GTCTTACAACACAGGCT -36 * * * * * * * * * f * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * ~ ~ * * * * * * * * * ~ * * * * ~ ~
*****************
S
...GTCTTACAACACAGGCT ATGATGCACCTCAAAC~CTTCCTGGACTGGGGATCCCGGCTAAATATAGCTGTTTCTCT....GTCTTACAACACAGGCT
P
GGATCCCGGCTAAATATAGCTGTTTCTCTCTCTGTCTTACAACACAGGCT
H ATGATGCACCTCAAACAACTTCCTGGACTGGGGATCCCGGCTAAATATAGCTGTTTCT... ...............................................................
...........................
*******ttt***t***
*************.***
+1
M CCAGTATATAAATCAGGCATTCCCCATTTGAGCATGRAACTGCCTGCATCT.AAGGTCTCCTCCAAGGCC
************** . . . . . . . . . . . . . . . . . . . . . . . . . .
* * * * * * * * * **********
* ******* *
*
+45
H CCAGTATATAAATCAGGCAATTCCCCATTTGAGCATGAACCTCTGAAAACTGCCGGCATCTGAGGTTT.CCTCCAAGGCC
************** . . . . . . . . . . . . . . . . . . . . . . . . . .
**+****** *** ******
*
* *******
S CCAGTATATAAATCCGGCAATTCCCCATTTGAGCATGAACCTCTGA~CGGCCGGCATCT.GAGGTCTCGTCCAAGGTC
************** . . . . . . . . . . . . . . . . . . . . . . . . . .
********* * * * ******
*
* *******
”
P CCAGTATATAAATCAGGCAATTCCCCATTTGAGCATGAACCTCTGA~CTGCCGGCATCT.AAGGTCTCCTCCAAGGCC Met
M CTCT..AGAGTCCAGCCCATAATG
****
******** ***
+67
H CTCT..GAAGTGCAGCCCATAATG
****
******** ***
S CTCTGGAGCACACAGCCCATGATG
****
******** ***
P CTCTGGAGTT..CAGCCCATAATG
Fig. S3. Alignment of the promoter and 5’ flanking region of the murine, human, ovine and porcine LIF genes. Nucleotide sequence identities are indicated by asterisks. Breaks introduced into the sequence to maximize similarity are indicated by dots. The major start-site of transcription in the murine gene (Stahl et al., 1990) is denoted by +1. TATA-like elements are underlined and in bold type, while potential binding sites for transcription factors (AP-2-like, SP-1) and repeats of the hexanucleotide CCTCCC (see Results and Discussion) are underlined. The translational start codon is indicated in bold type. The nucleotide positions given relate to the murine LIF gene (Stahl et al., 1990) and are the positions referred to in the text.
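The asterisk and dot conventions of this alignment can be illustrated with a short Python sketch. It uses the final block of the alignment (the 24 columns ending at the ATG start codon) as input. The function names are hypothetical, and the counting rule is an assumption rather than the authors' stated method: columns in which both sequences carry a gap are skipped, and a gap opposite a nucleotide counts as a mismatch.

```python
def pairwise_identity(a: str, b: str, gap: str = ".") -> tuple[int, int]:
    """Return (matching columns, compared columns) for two aligned sequences.

    Assumption: columns where both sequences have a gap are ignored;
    a gap opposite a nucleotide is counted as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches, compared


def identity_marks(seqs: list[str], gap: str = ".") -> str:
    """Mark with '*' every column in which all sequences share one nucleotide."""
    return "".join(
        "*" if col[0] != gap and all(c == col[0] for c in col) else " "
        for col in zip(*seqs)
    )


# Last alignment block of Fig. S3 (M, H, S and P rows).
murine = "CTCT..AGAGTCCAGCCCATAATG"
human = "CTCT..GAAGTGCAGCCCATAATG"
ovine = "CTCTGGAGCACACAGCCCATGATG"
porcine = "CTCTGGAGTT..CAGCCCATAATG"

print(identity_marks([murine, human, ovine, porcine]))
matches, compared = pairwise_identity(murine, human)
print(f"murine/human: {matches}/{compared} identical columns")
```

A different treatment of gapped columns would change the denominator, so percentages obtained this way need not reproduce the figures quoted in the text.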
M TCTTTCTTTTT CTTT....
*****
*******
CTTTCCGCTTTCTCTT~CAACTGCG.TCCCGGGCTGCTCCCTGGGAGGGGCGCAGGCGG
************** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
....C* *T*T*T*C*C* G* *C*T*T*T* C* T*C*C* T* *T *T*C*C*A* *A*C C* G* C G*G* T* *C*C*G* C* *T *G*C T* G* *G*G*G*C* G* *C*G*G*G*C* G* *G ATTTTT.CTTTCTTT ....CTTTCCGCTTTCTCTTTTCCAACCGCGCG.TC..GGGCTGCTCCCGGGW\GGGGCGCTGGCGG * * * ******** ************* ************ ** *********** *********** *** *
1388/408
H TCTTTT.CTTTCTTT
1164/425
S
887/371
***
********
P TCTTT..CTTTCTTTCTGTCTTTCCGCTTTCT..TTTCCAACCGCG.TCCCGGGCTGCTCCCTGGGAGGGGCGCCGGCCG
M CTGAGC.
* ****
H C.GAGC.
* ****
596/428
....AGCTTGCAAACTCCGGCCAGGACCG.GCGCAGGTGCGGCTTCCGTCTGCTAGTCCCT ....GGAAAGCT ****************** ** * * * *** ** ** * ** ** ** * * * * ** ....AGCTTGCAAACTCCGGCCTGGGACGGG. ..AGCAGGTGCCGCCTCCATCTGCT.GGTGTCTGGAA.GC.
1458/478
t**
**** ********* *** **
*f
*
......................
**** ***
**
1232/493
S C C G A G C C G A A C A G C T C G C A A A C T C C T G C C C G G C C C ~ G G G G A ~ A G G T G C C G C C T C C A T C T ~ C G G T ~ T G . A . . G C T 967/451
****
P CTGAGC
**** ********+ ******* * * *
.....AGCTTGCAAACTCCGGCCCGGGACG.G.
......................
******* *
***
..AGCAGGTGCCACCTCCATCTGCTA.GTGCCTGGAAAGCT
666/498
Fig. S4. Alignment of conserved sequences from intron 1 of the murine, human, ovine and porcine LIF genes. Nucleotide sequence identities are indicated by asterisks. Breaks inserted into the sequence to maximize homology are indicated by dots. Short direct repeats flanking potential deletions are underlined. Two nucleotide positions are given to the right of the sequence line. The first relates to the overall gene sequence as given in Figs S1 and S2 (ovine and porcine) and Stahl et al. (1990) (murine and human). The second represents an absolute position within intron 1 of each sequence with the G of the splice donor site representing nucleotide 1 of the first intron.
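The relationship between the two position values can be written out as a small Python sketch. The splice-donor coordinates in `DONOR_G` are not stated in the text; they are inferred here from the paired positions printed beside the alignment (for example, 1388/408 for the murine sequence implies that the donor G lies at gene position 981), and the function name is hypothetical.

```python
# Gene-coordinate position of the splice-donor G (nucleotide 1 of intron 1).
# Assumption: these values are inferred from the position pairs in Fig. S4,
# not quoted from the paper.
DONOR_G = {"murine": 981, "human": 740, "ovine": 517, "porcine": 169}


def intron_position(gene_position: int, species: str) -> int:
    """Convert an overall gene position to a position within intron 1."""
    return gene_position - DONOR_G[species] + 1


# Position pairs printed to the right of the two alignment blocks in Fig. S4.
pairs = {
    "murine": [(1388, 408), (1458, 478)],
    "human": [(1164, 425), (1232, 493)],
    "ovine": [(887, 371), (967, 451)],
    "porcine": [(596, 428), (666, 498)],
}

for species, values in pairs.items():
    for gene_pos, intron_pos in values:
        print(species, gene_pos, intron_position(gene_pos, species), intron_pos)
```

Both position pairs of each species give the same offset, which is a useful consistency check on the transcribed numbers.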
MURINE cDNA :
MetArgCysArgIleValPro CTAGTCCCTGGAAAGCTGTGATTGGCGCGAGATGAGATGCAGGATTGTGCCC
MetArgCysArg
MURINE GENE : GTCTGCTAGTCCCTGGAAAGCTGTGATTGGCGCGAGATGATGC~~ATGGGTACCC1495/515 ValArgTyrArg
HUMAN GENE
: ATCTGCTGGTGTCTGGAAGCGTGTGGTCTGCGCTAGGTGAGATATAGGGGTGTGGCCCC 1271/532
OVINE GENE :
ACTCGGGAATCCCAGGCATCCTGGGAGGGGGGAGATAAGGGGGTTGATGCC
1791/1275
PORCINE GENE:
ValSerProGly ATCTGCTAGTGCCTGGAAAGCTGTGATCTGTGCTAGGTGAGCCCGGGGTGTGGGGGGCC
703/503
LysAlaTrpGlu
Fig. S5. Comparison of exon 1 of a variant murine LIF cDNA with corresponding sequence from the human, ovine and porcine LIF genes. Nucleotide and amino acid sequence of a variant LIF cDNA clone (described by Rathjen et al., 1990) is given on the top line, with sequence of the alternative exon 1 in bold letters. The nucleotide and deduced amino acid sequence of the DNA sequence in intron 1 of the murine gene corresponding to this alternative exon is given on the second line. Regions within intron 1 of the human, ovine and porcine genes which have similarity with this exon are given on lines 3, 4 and 5; similar residues are in bold. Two nucleotide positions are given to the right. The first relates to the overall gene sequence as given in Figs S1 and S2 (ovine and porcine) and Stahl et al. (1990) (murine and human). The second represents an absolute position within intron 1 of each sequence with the G of the splice donor site representing nucleotide 1 of the first intron (Stahl et al., 1990; Figs S3 and S4).
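The deduced amino acid sequence shown above the variant murine cDNA can be checked with a minimal Python sketch that translates from the first ATG using the standard genetic code. The function name and the three-letter output format are illustrative choices, and the input is the cDNA line of this figure as transcribed here.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "***",
}


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG, stopping at a stop codon or the sequence end."""
    seq = dna.upper()
    start = seq.find("ATG")
    if start == -1:
        return ""
    residues = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return "".join(residues)


# Variant murine cDNA line of Fig. S5.
variant_cdna = "CTAGTCCCTGGAAAGCTGTGATTGGCGCGAGATGAGATGCAGGATTGTGCCC"
print(translate_from_first_atg(variant_cdna))
```

Running the same function on a sequence with no ATG returns an empty string.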
Fig. S6. Similarity comparison of the murine, rat, human, ovine and porcine LIF protein sequences. Amino acid identities are indicated by asterisks and conservative substitutions (Arg/Lys; Glu/Asp; Gln/Asn; Ile/Leu/Val; Ser/Thr) by dashes. Amino acid residues conserved across all five species are boxed. Potential N-linked glycosylation sites are indicated by hatches (###). Positions of conserved cysteine residues are marked by asterisks above the murine sequence, while the non-conserved cysteine is indicated by a downward-facing arrowhead (V). Positions corresponding to splice junctions in the mRNA are indicated by triangles.