Plant Physiol. (1992) 99, 773-774
Received for publication October 22, 1991 Accepted November 30, 1991
0032-0889/92/99/0773/02/$01.00/0
Plant Gene Register
Nucleotide Sequences of a Soybean cDNA Encoding an 18 Kilodalton Late Embryogenesis Abundant Protein¹

Zuei-ying Chen, Yue-ie C. Hsing*, Pei-fang Lee, and Teh-yuan Chow

Institute of Botany, Academia Sinica, Taipei, Taiwan, Republic of China

We report the sequence of a cDNA clone, pGmPM1, corresponding to a soybean mature seed-abundant mRNA (GmPM for Glycine max physiological mature; Table I). It was selected from a pod-dried cotyledon cDNA library in λZAPII (3) that was screened by differential hybridization with single-stranded cDNA probes prepared from the poly(A+)² RNA of immature seeds (35 DAF, days after flowering) and of pod-dried seeds. Figure 1 shows the nucleotide sequence of the 836-bp cDNA insert and the derived amino acid sequence. Only one long open reading frame was found, and the molecular mass of the deduced protein is similar to that predicted by hybrid-select translation (3). The deduced protein is very hydrophilic, as are most other Lea proteins (2), and consists of 173 amino acid residues, corresponding to a molecular mass of 18 kD. Two potential poly(A+) signals, AATAA and AATAAA, are found in the 3'-noncoding region, 27 and 19 nucleotides, respectively, upstream from the poly(A+) tail.

A search for polypeptide homologies in databanks revealed a strong local similarity with Lea protein D113 from cotton (1). Recently, this gene was grouped into the LeaA class of genes and was shown to be expressed at the late stage of seed development as well as in mid-development seeds after excision and culture in basal medium or ABA treatment (4).

Nucleotide sequence (5' to 3'; numbers give the position of the first nucleotide in each row):

  1  GAAAACCTAAGCTTAGAATTAGAAACAAAGAAAAGAAGAAAGGAAAGAATTGATCGTTGA
 61  AAATGCAGGGAGGAAAGAAAGCGGGAGAGTCCATTAAGGAAACAGCTACAAACATTGGTG
121  CTTCTGCCAAGGCTGGCATGGAGAAGACCAAGGCCACTGTCCAAGAGAAGGCAGAAAGAA
181  TGACAGCACGTGATCCGGTGCAAAAGGAGTTGGCAACCCAAAAGAAAGAGGCAAAGATGA
241  ACCAGGCGGAGCTGGACAAGCAGGCGGCGCGTCAGCATAACACGGCGGCCAAACAGTCCG
301  CCACCACGGCTGGGCACATGGGACATGGCCACCACACCACCGGAACCGGGACTGGAACCG
361  CCACGTATTCCACCACCGGGGAATATGGTCAGCCCATGGGGGCCCACCAGACGTCGGCAA
421  TGCCTGGCCATGGAACCGGGCAGCCCACGGGCCATGTCACCGAAGGAGTGGTGGGCTCAC
481  ATCCCATTGGAACCAACAGAGGGCCGGGTGGGACCGCTACAGCTCATAATACCCGTGCGG
541  GTGGAAAGCCGAATGATTATGGGTATGGGACTGGGGGTAC;I'A&TTAATGATGATTATAT
601  CATCAAGGTTTATTGGCAATAGTTTAATAATAAGGTTTATTAAATGTAGTTAGTTTGGGG
661  TATGGCTTTTGTACTTGTTTGTTTGGTTATCTTTGTTTGTCTGCTTTTCTCTCCACTTGT
721  TATGGAGAGACTTGGGTGTGGTTATGACAGTGTTTTCACTACGTGGAAACGTTATATGTG
781  GTAAACTTGTACTTGTATCATTTCGTGATAATAATTAA&TAATAGTGGTTGTGT

Deduced amino acid sequence (translation starts at the ATG at nucleotides 63-65; numbers give the position of the last residue in each row):

MQGGKKAGESIKETATNIGA   20
SAKAGMEKTKATVQEKAERM   40
TARDPVQKELATQKKEAKMN   60
QAELDKQAARQHNTAAKQSA   80
TTAGHMGHGHHTTGTGTGTA  100
TYSTTGEYGQPMGAHQTSAM  120
PGHGTGQPTGHVTEGVVGSH  140
PIGTNRGPGGTATAHNTRAG  160
GKPNDYGYGTGGT         173

Figure 1. Nucleotide sequence of the 836-bp cDNA insert and the derived amino acid sequence of pGmPM1 from soybean. The start codon, stop codon, and potential poly(A+) sites are indicated by underlines.
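The step from cDNA to deduced protein described above can be illustrated with a short Python sketch. It translates an excerpt of the open reading frame (nucleotides 243-302 of Figure 1, which encode residues 61-80) and estimates the mass of the 173-residue deduced protein. The standard genetic code and the average residue masses are assumptions supplied here, not values taken from the paper, and the helper names `translate` and `molecular_mass` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed, not from the paper).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Average residue masses in daltons (standard textbook values, assumed here).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.015  # one water per chain for the free N- and C-termini


def translate(orf: str) -> str:
    """Translate an in-frame nucleotide string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        aa = CODON_TABLE[orf[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def molecular_mass(protein: str) -> float:
    """Approximate mass in daltons from average residue masses."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


# Nucleotides 243-302 of Figure 1 (codons 61-80 of the open reading frame).
EXCERPT = "CAGGCGGAGCTGGACAAGCAGGCGGCGCGTCAGCATAACACGGCGGCCAAACAGTCCGCC"

# Deduced amino acid sequence as given in Figure 1 (173 residues).
DEDUCED_PROTEIN = (
    "MQGGKKAGESIKETATNIGA"
    "SAKAGMEKTKATVQEKAERM"
    "TARDPVQKELATQKKEAKMN"
    "QAELDKQAARQHNTAAKQSA"
    "TTAGHMGHGHHTTGTGTGTA"
    "TYSTTGEYGQPMGAHQTSAM"
    "PGHGTGQPTGHVTEGVVGSH"
    "PIGTNRGPGGTATAHNTRAG"
    "GKPNDYGYGTGGT"
)

if __name__ == "__main__":
    print("Residues 61-80:", translate(EXCERPT))
    print("Length:", len(DEDUCED_PROTEIN), "residues")
    print("Approximate mass: %.1f kD" % (molecular_mass(DEDUCED_PROTEIN) / 1000))
```

Under these assumptions the estimate should fall close to the 18 kD reported for the deduced protein. Average residue masses ignore any post-translational modification, so the figure is only a rough check.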
LITERATURE CITED

1. Baker J, Steele C, Dure L III (1988) Sequence and characterization of 6 Lea proteins and their genes from cotton. Plant Mol Biol 11: 277-291
2. Dure L III, Crouch M, Harada J, Ho T-hD, Mundy J, Quatrano R, Thomas T, Sung ZR (1989) Common amino acid sequence domains among the LEA proteins of higher plants. Plant Mol Biol 12: 475-486
3. Hsing Y-iC, Wu S-j (1991) Cloning and characterization of cDNA clones encoding soybean seed maturation polypeptides. Bot Bull Academia Sinica 33: 191-199
4. Hughes DW, Galau GA (1991) Developmental and environmental induction of Lea and LeaA mRNAs and the postabscission program during embryo culture. Plant Cell 3: 605-618
¹ This research was supported in part by the National Science Council to Y.C.H. and T.C.
² Abbreviations: poly(A+), polyadenylation; Lea, late embryogenesis abundant; bp, base pair.
Table I. Characteristics of pGmPM1

Organism: Glycine max L., var Williams'82.
Function: Encodes 18-kD soybean seed maturation polypeptides (3).
Clone Type; Designation: cDNA, full length, pGmPM1.
Source: cDNA library in λZAPII vector constructed from the poly(A+) RNA of 4-d pod-dried 35 DAF soybean cotyledons.
Method of Isolation; Subsequent Identification: Differential hybridization with single-stranded cDNA probes prepared from fresh immature seeds (35 DAF) and 4-d pod-dried seed poly(A+) RNA. pGmPM1 showed a strong hybridization signal only with the homologous probe.
Sequencing Strategy: Single-stranded templates; unidirectional deletion subcloning and complete dideoxy sequencing of both strands.
Features of cDNA Structure: Transcript of approximately 900 nucleotides as detected on northern blots; this clone of 836 nucleotides contains 62 nucleotides 5'-untranslated region, 519 nucleotides open reading frame, and 254 nucleotides 3'-untranslated region.
Codon Usage: Codons not present in this cDNA: GTA(V), GTT(V), AGT(S), AGC(S), ATA(I), ATC(I), TAC(Y), TTA(L), TGG(W), TGT(C), TGC(C), TTT(F), TTC(F), CGG(R), CGA(R), CTA(L), CTT(L), CTC(L), and CCA(P).
(G+C) Content: 47.80% along entire length; 56.65% in protein-coding region.
Structural Features of Protein: Open reading frame of 173 amino acids; predicted molecular mass of the protein is 18 kD. There are no Cys or Trp residues in the protein.
Antibody: Not available.
GenBank Accession No.: M80666
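The codon usage and (G+C) content entries in Table I are simple counts over the sequence, and a minimal Python sketch of how such figures could be derived follows. The function names `gc_content` and `absent_codons` are hypothetical, and the excerpt covers only nucleotides 243-362 of Figure 1 (codons 61-100), so its values are illustrative and are not expected to reproduce the percentages in the table.

```python
from itertools import product


def gc_content(seq: str) -> float:
    """Percentage of G and C nucleotides in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def absent_codons(orf: str) -> list:
    """Codons (of all 64, stop codons included) never used in frame in `orf`."""
    orf = orf.upper()
    used = {orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)}
    every = {"".join(c) for c in product("ACGT", repeat=3)}
    return sorted(every - used)


# Codons listed in Table I as not present in the cDNA.
REPORTED_ABSENT = {
    "GTA", "GTT", "AGT", "AGC", "ATA", "ATC", "TAC", "TTA", "TGG", "TGT",
    "TGC", "TTT", "TTC", "CGG", "CGA", "CTA", "CTT", "CTC", "CCA",
}

# Nucleotides 243-362 of Figure 1, in frame (codons 61-100).
EXCERPT = (
    "CAGGCGGAGCTGGACAAGCAGGCGGCGCGTCAGCATAACACGGCGGCCAAACAGTCCG"
    "CCACCACGGCTGGGCACATGGGACATGGCCACCACACCACCGGAACCGGGACTGGAACCG"
    "CC"
)

if __name__ == "__main__":
    print("G+C content of excerpt: %.2f%%" % gc_content(EXCERPT))
    missing = set(absent_codons(EXCERPT))
    # Every codon reported absent from the whole cDNA must also be absent here.
    print("Consistent with Table I:", REPORTED_ABSENT <= missing)
```

Because the excerpt is a subset of the coding region, the set of codons it lacks is necessarily larger than the 19 reported in Table I; the check only confirms that none of the reported codons appears in frame in this stretch.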