THE JOURNAL OF BIOLOGICAL CHEMISTRY
Vol. 265, No. 25, Issue of September 5, pp. 14922-14931, 1990. Printed in U.S.A.
Genomic Organization and Chromosomal Localization of the Human Nucleolin Gene*
(Received for publication, March 29, 1990)

Meera Srivastava‡§, O. Wesley McBride¶, Patrick J. Fleming∥, Harvey B. Pollard§, and A. Lee Burns§
From the ‡Cardiorenal Division, Food and Drug Administration, the §Laboratory of Cell Biology and Genetics, National Institute of Diabetes and Digestive and Kidney Diseases, and the ¶Laboratory of Biochemistry, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892 and the ∥Department of Biochemistry and Molecular Biology, Georgetown University School of Medicine, Washington, D.C. 20007
Nucleolin, a eukaryotic nucleolar phosphoprotein, is involved in the synthesis and maturation of ribosomes. To characterize the genomic organization and regulatory sequences of this gene, two overlapping λ clones containing the human nucleolin gene plus flanking regions were isolated from a genomic library using human nucleolin cDNA. Southern blots of genomic DNA from human, several other mammals, chicken, and yeast revealed that the nucleolin gene is well conserved across these species. The gene consists of 14 exons with 13 intervening sequences and spans approximately 11 kilobases of DNA. Analysis of the splice junctions indicated that the amino-terminal domain and the four RNA binding domains plus the nuclear localization signal are split into adjacent exons. Sequences from the 5’-flanking region and the first intron contain a high content of GC residues, which is consistent with nucleolin being a “housekeeping” gene. Promoter elements include an atypical TATA box (GTTA), one CCAAT box much further from the initiation site, three reverse complements of CCAAT (ATTGG), and two pyrimidine-rich nucleotide stretches. In addition, this region and the first intron contain numerous potential Sp1, GCF, CRE-fos, GCN, AP-1, AP-2, and UCE sites, and sequences similar to the glucocorticoid receptor binding site. The transcription start site was determined by primer extension and S1 nuclease mapping of RNA from human liver. One Kpn and three Alu repeats were found within two of the middle introns. The 3’-untranslated portion of the gene contains five homology blocks in a 100-base pair region that are highly conserved among human, mouse, and hamster genomes. Finally, we have determined that the human nucleolin gene is located on chromosome 2q12-qter and is present at one copy per haploid genome. A restriction fragment length polymorphism with EcoRI has been detected in the gene.
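The promoter features summarized above (CCAAT boxes, their reverse complements ATTGG, and a high GC content) are the kind of elements that a simple string scan can locate. The following is a minimal sketch in Python, not the program used in this work: the function names (`reverse_complement`, `gc_content`, `find_motif`, `scan_ccaat`) and the toy sequence are hypothetical and chosen only for illustration, and the toy sequence is not the nucleolin 5’-flanking sequence.

```python
from __future__ import annotations

# Illustrative only: the names and the toy sequence below are assumptions,
# not taken from the paper or from any sequence-analysis package.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_content(seq: str) -> float:
    """Return the fraction of G and C residues; 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def scan_ccaat(seq: str) -> dict[str, list[int]]:
    """Locate CCAAT and its reverse complement (ATTGG) on the given strand."""
    forward = "CCAAT"
    reverse = reverse_complement(forward)
    return {forward: find_motif(seq, forward), reverse: find_motif(seq, reverse)}


if __name__ == "__main__":
    toy = "GCGCATTGGCGCGGCCAATGCGC"  # made-up sequence, for demonstration only
    print(scan_ccaat(toy))
    print(round(gc_content(toy), 2))
```

Scanning one strand for both CCAAT and ATTGG is equivalent to scanning both strands for CCAAT, which is why the inverted boxes are reported as ATTGG on the coding strand. Degenerate sites such as Sp1 or AP-2 would need pattern matching beyond the exact-string search shown here.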
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J05584.

Ribosomal biogenesis involves the synthesis and maturation of preribosomal RNA molecules within the nucleolus of eukaryotes in association with transiently bound, specific proteins, such as nucleolin and RNA polymerase I. Nucleolin is a 100-kDa phosphoprotein which is activated by selective proteolysis and binds to rDNA spacer regions and nascent rRNA in the nucleolus and remains attached to ribosomes during transport to the cytoplasm (1-6). Although the mechanism of nucleolin action has not yet been completely defined, recent analyses of rodent (7) and our human nucleolin cDNA sequences (8) have revealed the presence of three major types of domains in nucleolin. These may correlate with some of nucleolin’s several activities. First, the amino-terminal region contains four highly acidic, phosphorylated segments that can interact with histones and decondense chromatin by releasing histones (9, 10). These less well conserved and variable domains may be responsible for the observed ability of nucleolin to bind to rDNA spacer regions (11). Second, four well conserved RNA binding domains are probably the sequences responsible for nucleolin’s contact with rRNA (12-14). Finally, a polyglycine region is found at the C-terminal end and may be involved in protein-protein interactions (15).

It is interesting to note that the synthesis of nucleolin is positively correlated with increased rates of cell division, and thus the amount of nucleolin is highest in tumor or other rapidly dividing cells (16). Inasmuch as ribosomal RNA (rRNA) synthesis is critical for the regulation of cell division and can itself be regulated by such diverse stimuli as nutrient starvation, stage of embryogenesis, nucleogenesis and viral infection (17-24), we have devoted considerable energy to understanding the structure and function of human nucleolin. To this end, we have recently reported the cloning of human nucleolin cDNA and shown that the predicted amino acid sequence of nucleolin contains a multiple domain structure and is similar in many respects to the hamster and Xenopus nucleolins.
Furthermore, in attempts to gain insight into the transcription of the nucleolin gene, we now report the complete sequence of the human nucleolin gene and the analysis of regulatory elements in the 5’-flanking sequences. We were particularly interested in the possible conservation of splice junctions in relation to the protein domains and also the similarity of transcriptional regulatory sequences in different species. In addition, we have determined that the nucleolin gene is located on chromosome 2q12-qter and is not syntenic with other known ribosome-related genes.

MATERIALS AND METHODS

Isolation of Genomic Clones-We screened a human genomic library with inserts generated by partial Sau3AI digestion of lung fibroblast DNA in λ Fix vector (Stratagene). The hybridization probe was a random primer-labeled (25) EcoRI fragment of the full length nucleolin cDNA (8). Thirty clones were identified on the primary screen
and then probed with a synthetic 36-bp¹ oligonucleotide from the 5’ noncoding region of the nucleolin cDNA (bases 44 to 59 in Ref. 8). Two of the thirty phage hybridized with the oligonucleotide and were plaque-purified. The phage DNAs were used for Southern analysis and hybridization with nine different labeled oligonucleotides corresponding to various regions of the cDNA. One clone (HG3) contains the entire nucleolin gene and flanking regions, while the other (HG10) extends more in the 5’ direction.

Nucleotide Sequence Analysis-Restriction endonuclease fragments from the HG3 clone (EcoRI, 3.5, 2.1, and 0.8 kb and XbaI, 8.0, 2.8, 2.0, and 0.8 kb) and the HG10 clone (EcoRI, 0.8 kb) were subcloned into pGX2627 plasmid vector. These subclones were sequenced using either two primers homologous to sequences flanking the multiple cloning site of pGX2627 or 30 oligonucleotides based on cDNA or derived genomic sequences. The entire gene plus flanking regions were sequenced by the dideoxy method (26) using cloned T7 DNA polymerase (Sequenase, United States Biochemical Corp.). Oligonucleotides were synthesized in an Applied Biosystems DNA synthesizer model 380B. DNA sequences were compiled with the Microgenie Program (Beckman). The 5’-flanking sequences, first intron, and 3’-flanking sequences were analyzed for regulatory elements using a computer model.

Genomic Southern Analysis-Human placental DNA (6 µg) was digested with several restriction enzymes, separated according to size in a 0.8% agarose gel, electroblotted onto nylon membrane, and covalently cross-linked by UV irradiation (Clontech). Another Southern blot with EcoRI-treated DNA from human placenta, rhesus monkey, Sprague-Dawley rat, BALB/c mouse, canine, bovine, rabbit, chicken, and yeast (Saccharomyces cerevisiae) was prepared in a similar manner. Both membranes were prehybridized and hybridized with ³²P-labeled cDNA using conditions recommended by the manufacturer.
After overnight hybridization at 65 °C, the membranes were washed twice with 2 × SSC (1 × SSC = 0.15 M sodium chloride, 0.015 M sodium citrate, pH 7.0) for 15 min at room temperature, followed by two washes each with 2 × SSC, 0.1% sodium dodecyl sulfate and with 0.1 × SSC, 0.1% sodium dodecyl sulfate for 30 min at 65 °C. Membranes were briefly blot-dried and autoradiographed with two intensifying screens for 2 days.

RNA Isolation-Total RNA was isolated from human liver using the procedure of Chirgwin et al. (27). Poly(A)+ RNA was obtained by affinity chromatography on oligo(dT)-cellulose (28).

Primer-Extension Analysis-Two synthetic oligonucleotides complementary to the mRNA near the 5’ end (rpHR1, GATGAGTCCAGAAGAAGCCAAGCGACGGCGATGGCG; rpHR4, ACCTGCCTTCGCGAGCTTCACCATGATGGCGGCGGA) were end-labeled with [γ-³²P]ATP and T4 polynucleotide kinase. Primer extension analysis was performed by modification of the previously described method (29). 50 µg of RNA (total liver RNA + yeast tRNA) was lyophilized in a microfuge tube with 400 mM NaCl, 40 mM Pipes, pH 6.4, and 1 mM EDTA, pH 8.0. The DNA probe (100,000 cpm in 16 µl) was added to 40 µl of deionized formamide, heated at 90 °C for 5 min, and cooled to 55 °C. After the RNA was resuspended in the probe solution, the mixture was incubated at 55 °C for 4-6 h. Samples were precipitated twice with ethanol and resuspended in 50 µl of buffer containing 50 mM Tris-HCl, pH 9.3, 75 mM KCl, 5 mM MgCl2, 2 mM dithiothreitol, 0.5 mM four dNTPs, and 200 units of Moloney reverse transcriptase (Bethesda Research Laboratories). cDNA was synthesized for 1 h at 37 °C, followed by degradation of the RNA in NaOH, extraction of the samples with phenol/chloroform, and analysis on a polyacrylamide sequencing gel.

S1 Nuclease Mapping-S1 nuclease protection of the 5’ end of the nucleolin transcript was performed as previously described (30). Briefly, an end-labeled oligonucleotide (1025 bp to 1114 bp, Fig. 2) was hybridized to liver RNA and then treated with S1 nuclease to digest single-stranded nucleic acids. After hybridization, 400 µl of S1 buffer (0.25 M sodium chloride, 0.2 M sodium acetate, pH 4.5, 1.25 M zinc acetate) and 500 to 2000 units of S1 nuclease were added. Incubations were carried out for 2 h at 30 °C. The protected fragments were purified, separated on a 6% polyacrylamide sequencing gel, and detected by autoradiography.

¹ The abbreviations used are: bp, base pair(s); kb, kilobase pair(s).

Chromosomal Localization-The procedure to map the human nucleolin gene is based on Southern analysis of DNAs isolated from somatic cell hybrids in which subsets of human chromosomes clonally exist in a background of either mouse or hamster chromosomes. Construction, karyotypic analysis of banded mitotic chromosomes, and electrophoretic analysis of human biochemical markers in these hybrids and for the human, mouse, and Chinese hamster parental cells have been described (31-33). Southern hybridization and washings were performed under high stringency conditions which allow detection of