GENOME ANNOUNCEMENT
Complete Genome Sequence of the Hyperthermophilic Archaeon Thermococcus sp. Strain CL1, Isolated from a Paralvinella sp. Polychaete Worm Collected from a Hydrothermal Vent

Jong-Hyun Jung (a), James F. Holden (b), Dong-Ho Seo (a), Kwan-Hwa Park (c), Hakdong Shin (d), Sangryeol Ryu (d), Ju-Hoon Lee (a), and Cheon-Seok Park (a)

(a) Department of Food Science and Biotechnology, Institute of Life Sciences and Resources, Kyung Hee University, Yongin, Republic of Korea
(b) Department of Microbiology, University of Massachusetts, Amherst, Massachusetts, USA
(c) Department of Food Service Management and Nutrition, SangMyung University, Seoul, Republic of Korea
(d) Department of Agricultural Biotechnology, Seoul National University, Seoul, Republic of Korea
Thermococcus sp. strain CL1 is a hyperthermophilic, anaerobic, and heterotrophic archaeon isolated from a Paralvinella sp. polychaete worm living on an active deep-sea hydrothermal sulfide chimney on the Cleft Segment of the Juan de Fuca Ridge. To further understand the distinct characteristics of this archaeon at the genome level, its genome was completely sequenced and analyzed. Here, we announce the complete genome sequence (1,950,313 bp) of Thermococcus sp. strain CL1, with a focus on H2- and energy-producing capabilities and its amino acid biosynthesis and acquisition in an extreme habitat.
Hyperthermophilic archaea have unique genetic and metabolic features for growth in extreme environments; however, the diversity of these features among hyperthermophiles is poorly understood (6). Thermococcus sp. strain CL1 was isolated from a Paralvinella sp. polychaete worm collected from an active deep-sea hydrothermal vent sulfide chimney (7). It grew more rapidly and over a wider temperature range than other Thermococcus species isolated from nonworm sources, produced a suite of proteases (7), and generated significant amounts of H2 even when grown on elemental sulfur (11).

Genomic DNA from Thermococcus sp. strain CL1 was isolated as described previously (9) and sequenced completely using a GS FLX Titanium pyrosequencer (Macrogen, Seoul, South Korea). GeneMarkS (2), Glimmer 3.02 (3), and FgenesB (Softberry, Inc., Mount Kisco, NY) were used to predict the open reading frames (ORFs) present. The functions of the predicted ORFs were verified using BLASTP (1) and InterProScan (15). tRNAs and rRNAs were predicted using tRNAscan-SE and RNAmmer, respectively (8, 10). CRISPRFinder and SignalP were used to identify CRISPR repeats and extracellular proteins, respectively (5, 12).

The complete genome of Thermococcus sp. strain CL1 consists of a circular chromosome of 1,950,313 bp with a GC content of 55.8%, containing 2,017 ORFs, 46 tRNAs, two 5S rRNA genes, one 16S rRNA gene, and one 23S rRNA gene. The chromosome has three CRISPR-associated gene (cas) clusters and five CRISPR loci in the vicinity of the cas gene clusters, which likely defend the cell against viruses and mobile elements (14). Interestingly, CL1 also has flexible direct repeats of 16 nucleotides without any nonrepetitive spacers.

Thermococcus sp. strain CL1 grows well at 85°C on peptides and elemental sulfur (7) and produces H2S and H2 in relatively equal proportions (11). The H2 is formed by a membrane hydrogenase complex and two soluble hydrogenases, with concomitant ATP production by a membrane-bound ATP synthase (13).
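The summary statistics reported above (chromosome length, GC content, ORF count) are simple to recompute once a sequence is in hand. The following is a minimal sketch and not the authors' pipeline: the helper names `parse_fasta`, `gc_percent`, and `bp_per_orf` are hypothetical, and the toy sequence is invented for illustration. Only the figures 1,950,313 bp and 2,017 ORFs come from the text.

```python
from collections import Counter


def parse_fasta(text):
    """Return the sequence of the first record in FASTA-formatted text."""
    seq_lines = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # stop at the second record
            seen_header = True
            continue
        seq_lines.append(line.upper())
    return "".join(seq_lines)


def gc_percent(seq):
    """GC content as a percentage of unambiguous bases (A, C, G, T)."""
    counts = Counter(seq)
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (counts["G"] + counts["C"]) / acgt


def bp_per_orf(genome_bp, n_orfs):
    """Average chromosome length per predicted ORF (a crude gene-density measure)."""
    return genome_bp / n_orfs


# Toy record, invented for illustration only (not CL1 sequence data).
toy_sequence = parse_fasta(">toy\nGGCC\nATGC\n")
toy_gc = gc_percent(toy_sequence)

# Figures taken from the announcement: 1,950,313 bp and 2,017 ORFs.
cl1_bp_per_orf = bp_per_orf(1_950_313, 2_017)
```

With the reported figures, the chromosome carries on average roughly one predicted ORF per 967 bp, which is consistent with the compact gene packing typical of prokaryotic genomes.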
A KEGG pathway analysis revealed that CL1 has no tricarboxylic acid (TCA) cycle, an incomplete pentose phosphate pathway, and no shikimate pathway, suggesting that this strain does not produce α-ketoglutarate and erythrose 4-phosphate as amino acid precursors. Therefore, CL1 probably does not produce Glu, Gln, Pro, Arg, or aromatic amino acids (Phe, Tyr, and Trp) (16). To synthesize proteins with all required amino acids, CL1 must obtain these missing amino acids from other organisms (16). The peptides required for growth are transported across the membrane by dipeptide (Dpp)/oligopeptide (Opp) family ABC-type transporters (16). CL1 possesses two gene clusters of the Dpp/Opp family and four Dpp/Opp family permeases. It also has at least five proteases that are similar to pyrolysin- or subtilisin-like serine proteases (4). CL1 likely obtains the peptides it needs for growth through its proximal association with the worm (7).

The complete genome sequence of Thermococcus sp. strain CL1 provides insight into the organism's peptide metabolism, energy generation, and metabolite production capabilities, which will aid in our understanding of the growth of this organism in extreme environments.

Nucleotide sequence accession number. The final annotated genome sequence of Thermococcus sp. strain CL1 is now accessible in GenBank under accession number CP003651.

ACKNOWLEDGMENTS
This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MEST) (2011-0027299) and by grants from the Northeast Sun Grant Institute of Excellence (NE07-030 and NE11-26), USDA CSREES (MAS00945), and NSF (OCE0732611) to J.F.H.

September 2012 Volume 194 Number 17
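The amino acid argument in the text (missing or incomplete pathways imply missing precursors, which in turn imply amino acids that must be acquired externally) can be written out as a small lookup. This is an illustrative sketch only: the table `PATHWAY_LOSSES` and the function `predicted_missing_amino_acids` are hypothetical names, and the pathway-to-precursor mapping is a deliberate simplification of the reasoning stated in the announcement, not a metabolic model.

```python
# Simplified mapping: pathway -> (precursor it would supply, amino acids affected).
# The shikimate pathway is listed without a precursor because it consumes
# erythrose 4-phosphate rather than producing it.
PATHWAY_LOSSES = {
    "TCA cycle": ("alpha-ketoglutarate", ["Glu", "Gln", "Pro", "Arg"]),
    "pentose phosphate pathway": ("erythrose 4-phosphate", ["Phe", "Tyr", "Trp"]),
    "shikimate pathway": (None, ["Phe", "Tyr", "Trp"]),
}


def predicted_missing_amino_acids(unavailable_pathways):
    """List amino acids the cell probably cannot make, in first-seen order.

    Pathways not present in PATHWAY_LOSSES are ignored.
    """
    missing = []
    for pathway in unavailable_pathways:
        _precursor, amino_acids = PATHWAY_LOSSES.get(pathway, (None, []))
        for amino_acid in amino_acids:
            if amino_acid not in missing:
                missing.append(amino_acid)
    return missing


# Pathways reported as absent or incomplete in CL1 by the KEGG analysis.
cl1_unavailable = ["TCA cycle", "pentose phosphate pathway", "shikimate pathway"]
cl1_missing = predicted_missing_amino_acids(cl1_unavailable)
```

Applied to the three pathways named in the text, the lookup returns the same seven amino acids the authors list as probably not synthesized by CL1.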
Received 6 June 2012 Accepted 19 June 2012
Address correspondence to Ju-Hoon Lee, [email protected], or Cheon-Seok Park, [email protected].
Copyright © 2012, American Society for Microbiology. All Rights Reserved.
doi:10.1128/JB.01016-12
Journal of Bacteriology p. 4769–4770 jb.asm.org

REFERENCES
1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.
2. Besemer J, Lomsadze A, Borodovsky M. 2001. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Res. 29:2607–2618.
3. Delcher AL, Bratke KA, Powers EC, Salzberg SL. 2007. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23:673–679.
4. de Vos WM, et al. 2001. Purification, characterization, and molecular modeling of pyrolysin and other extracellular thermostable serine proteases from hyperthermophilic microorganisms. Methods Enzymol. 330:383–393.
5. Grissa I, Vergnaud G, Pourcel C. 2007. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 35:W52–57.
6. Holden JF. 2009. Extremophiles: hot environments, p 127–146. In Schaechter M (ed), Encyclopedia of microbiology. Elsevier, Oxford, United Kingdom.
7. Holden JF, et al. 2001. Diversity among three novel groups of hyperthermophilic deep-sea Thermococcus species from three sites in the northeastern Pacific Ocean. FEMS Microbiol. Ecol. 36:51–60.
8. Lagesen K, et al. 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 35:3100–3108.
9. Lee JH, et al. 2008. Comparative genomic analysis of the gut bacterium Bifidobacterium longum reveals loci susceptible to deletion during pure culture growth. BMC Genomics 9:247.
10. Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequences. Nucleic Acids Res. 25:955–964.
11. Oslowski DM, Jung JH, Seo DH, Park CS, Holden JF. 2011. Production of hydrogen from α-1,4- and β-1,4-linked saccharides by marine hyperthermophilic archaea. Appl. Environ. Microbiol. 77:3169–3173.
12. Petersen TN, Brunak S, von Heijne G, Nielsen H. 2011. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods 8:785–786.
13. Pisa KY, Huber H, Thomm M, Muller V. 2007. A sodium ion-dependent A1AO ATP synthase from the hyperthermophilic archaeon Pyrococcus furiosus. FEBS J. 274:3928–3938.
14. Portillo MC, Gonzalez JM. 2009. CRISPR elements in the Thermococcales: evidence for associated horizontal gene transfer in Pyrococcus furiosus. J. Appl. Genet. 50:421–430.
15. Zdobnov EM, Apweiler R. 2001. InterProScan-an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17:847–848.
16. Zivanovic Y, et al. 2009. Genome analysis and genome-wide proteomics of Thermococcus gammatolerans, the most radioresistant organism known amongst the Archaea. Genome Biol. 10:R70.