
Mar Biotechnol (2013) 15:244–251 DOI 10.1007/s10126-012-9482-z

ORIGINAL ARTICLE

PcarnBase: Development of a Transcriptomic Database for the Brain Coral Platygyra carnosus

Jin Sun & Qian Chen & Janice C. Y. Lun & Jianliang Xu & Jian-Wen Qiu

Received: 30 March 2012 / Accepted: 20 July 2012 / Published online: 9 August 2012
© Springer Science+Business Media, LLC 2012

Abstract

The aims of this study were to sequence the transcriptome of the brain coral Platygyra carnosus, a structure-forming dominant species along the coast of southern China, and to organize the sequence data into a searchable database. We collected healthy and tumorous coral tissues from two locations, extracted RNA from each tissue sample, pooled the RNA from all samples, generated a cDNA library from the pooled RNA, and conducted paired-end sequencing of the library on the Illumina platform to produce 59.6 M clean reads with a read length of 90 bp. De novo assembly of the sequence data resulted in 162,468 unigenes with an average length of 606 bp (range, 201 to 23,923 bp). This is the largest transcriptome dataset for a species of coral whose genome has not been sequenced. A BLASTx search against the NCBI protein database showed that 55,355 of the unigenes matched at least one sequence with an E-value of < 0.00001; 59 % of the matched sequences are from Metazoa, 13 % are from Alveolata, to which the symbiont Symbiodinium belongs, and 7 % are from bacteria. A database (PcarnBase) was constructed to provide easy access to the unigenes with attributes such as NCBI protein annotation, GO annotation, and KEGG pathway. It will facilitate functional genomic studies of P. carnosus, such as biomarker discovery for bleaching, tumor formation, and disease development at the gene or protein level, the involvement of coral symbiotic algae in the host coral’s stress responses, and the genetic basis of stress resistance.

Keywords Coral · Platygyra · Transcriptome · Database · Next-generation sequencing

Electronic supplementary material The online version of this article (doi:10.1007/s10126-012-9482-z) contains supplementary material, which is available to authorized users.

J. Sun · J.-W. Qiu (*) Department of Biology, Hong Kong Baptist University, Hong Kong, China
Q. Chen · J. Xu (*) Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
J. C. Y. Lun Agriculture, Fisheries and Conservation Department, The Government of the Hong Kong Special Administrative Region, Hong Kong, China

Introduction

Coral reefs are among the most diverse and productive ecosystems on Earth. They protect seashores, provide habitats for a variety of plants and animals, and are a source of food, income, and biologically active substances including medicines (Nybakken and Bertness 2005). Despite their ecological and economic importance, coral reefs around the world are threatened by global stressors such as climate change, ocean acidification, and rising sea levels (Hughes et al. 2003; Hoegh-Guldberg et al. 2007) and by local stressors such as coastal development, coral mining, overfishing, illegal fishing methods, pollution, and unsustainable tourism (Bryant et al. 1998). These stressors have resulted in a dramatic decline of coral reefs in many regions over the last several decades. A survey of 240 contributors from 98 countries indicated that approximately 20 % of the world’s coral reefs are dead, and 70 % of them are under immediate or longer-term threat (Wilkinson 2004). In Southeast Asia, the threat to around 50 % of the coral reefs has been classified as high or very high (Burke et al. 2002). Along the coasts of southern China, live coral cover has declined greatly and the dominant coral species have changed in Guangdong (Chen et al. 2007) and Hainan (Shi et al. 2007; Zhao et al. 2010) provinces over the last two decades.


Given the degradation of coral reefs in many regions around the world, the molecular responses of corals and their symbiotic dinoflagellates under normal and stressful conditions have become a focus of research (Forêt et al. 2007). Such studies have the potential to detect early changes in physiology due to stress or injury at the gene or protein level, which is the first step in developing diagnostic biomarkers of disease and environmental stress. Studies have been conducted using cDNA microarrays to understand the expression of selected genes in association with natural diel cycles (Levy et al. 2011) and developmental stages (Grasso et al. 2008; Portune et al. 2010) in Acropora millepora, in response to turbidity in A. millepora (Bay et al. 2009), in response to thermal stress in Montastraea faveolata (DeSalvo et al. 2010) and Acropora palmata (Portune et al. 2010), and in response to temperature, salinity, and ultraviolet light conditions in M. faveolata (Edge et al. 2005). However, microarrays are available for only a few species of corals (i.e., A. millepora, A. palmata, and M. faveolata) with extensive genomic resources such as expressed sequence tags. For most species of corals, the application of cDNA microarrays is hampered by a lack of genomic resources (Traylor-Knowles et al. 2011). Recent developments in DNA sequencing technology have enabled the rapid generation of genomic resources for coral species to support functional genomic studies (Meyer et al. 2009; Wang et al. 2010; Traylor-Knowles et al. 2011).

In this study, we aimed to produce a set of high-coverage transcriptomic data for the brain coral Platygyra carnosus (Veron 2000) using second-generation sequencing technology and to organize these data into a searchable database with a user-friendly interface. Species in the genus Platygyra are common in the shallow-water coral communities of the Indo-Pacific. In southern China, P. carnosus is a dominant species of scleractinian coral, forming fringing reef structures (Veron 2000). Species of Platygyra have been reported to suffer from many problems, such as hypoxia (BCL 1995), bleaching (Bhagooli and Hidaka 2004), corallivory (Lam et al. 2007), excessive bioerosion (Dumont et al., manuscript submitted for publication), white band disease (Coles 1994), white plague disease (Sutherland et al. 2004; Vargas-Ángel 2009), black band disease (Littler and Littler 1996; Thinesh et al. 2011), and tumor formation (Loya et al. 1984; Chiu et al. 2012). Nevertheless, there are very few genomic resources for Platygyra in public databases to support functional genomic studies of these ecologically important species. A search on 6 March 2012 revealed only 184 Platygyra sequences in the NCBI nucleotide database. Our comprehensive transcriptome sequencing and the management of the resulting sequences in PcarnBase may pave the way for future studies of the molecular mechanisms of coral diseases.


Materials and Methods

Transcriptomic Data

Transcriptomic data of adult colonies of P. carnosus were obtained by next-generation sequencing followed by de novo transcriptome assembly. The samples were collected in the field by SCUBA from water depths of 1–3 m. Four core samples, each 2 cm in diameter × 2 cm in depth, were collected: one healthy tissue sample and one tissue sample with a growth anomaly (i.e., “tumor”) from each of two locations (Fig. 1 and ESM 1 of the “Electronic Supplementary Material”). Of the two locations, Sharp Island is close (~ 15 min by boat) to Sai Kung, a local fishing and pleasure boating center, whereas Hoi Ha Wan is relatively far away from Sai Kung and has much less intensive human activities (Chiu et al. 2012). The samples were immediately frozen on dry ice in the field. Within 2 h, the samples were transported to the laboratory and stored in a freezer at −80 °C.

Fig. 1 A colony of P. carnosus showing growth anomaly (i.e., “tumor”) at the lower right corner and normal tissue on the remaining areas

Total RNA from each sample was extracted using the TRIzol® reagent (Invitrogen, CA, USA) following the manufacturer’s instructions. Polysaccharides were removed by adding a high-salt solution (0.8 M sodium citrate and 1.2 M NaCl) in the RNA isopropanol precipitation step. RNA integrity was examined by agarose gel electrophoresis and with a Bioanalyzer 2100 (Agilent Technologies, CA, USA). Equal amounts of RNA from the different samples were pooled. Messenger RNA was enriched using a PolyATract® mRNA Isolation System (Ambion, Austin, TX, USA). Fragmentation buffer was added to break the mRNA into fragments of approximately 200 bp. Random hexamer primers were used to synthesize the first strand of cDNA, using the short fragments as templates. The second strand of cDNA was synthesized using dNTPs, RNaseH, and DNA polymerase I. Short fragments were purified using a QiaQuick polymerase chain reaction (PCR) extraction kit (Qiagen, Valencia, CA, USA) and resolved with EB buffer for end repair and addition of a poly(A) tail. Afterwards, the short fragments were connected with sequencing adapters, and agarose gel electrophoresis was run. Suitable fragments were selected as templates for PCR amplification (Wang et al. 2010). Double-stranded cDNAs were sequenced using a HiSeq™ 2000 (Illumina, San Diego, CA, USA).

The raw reads were filtered to remove adaptors, reads with > 5 % unknown nucleotides, low-quality reads (those in which more than 20 % of the bases have a Q value below 10, i.e., a sequencing error rate above 10 %), and reads without overlap with other reads. The clean reads were deposited in the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra) with accession number SRA049305.1.

The clean reads were assembled using Trinity, a method for de novo assembly of RNA-Seq data (Grabherr et al. 2011). Three independent Trinity software modules, Inchworm, Chrysalis, and Butterfly, were applied sequentially to process the clean reads. Reads with overlapping sequences were merged to form contigs using Inchworm. The contigs were clustered, and a de Bruijn graph for each cluster was constructed using Chrysalis. The de Bruijn graphs were then processed and paired-end-matched to form longer sequences using Butterfly. The assembly parameters used in Trinity were set as “seqType = fq, min_contig_length = 100, min_kmer_cov = 2”, with the rest being default parameters. All of the nucleotide gaps were filled during assembly. To reduce the redundancy of the sequences assembled by Trinity, TIGR Gene Indices Clustering Tools (TGICL) was applied to obtain non-redundant clusters, and the sequences that could not be clustered were named singletons (Pertea et al. 2003). Both clusters and singletons are called unigenes.

Unigene Annotation, Classification, and Metabolic Pathway Analysis

The unigene sequences from P. carnosus were searched against the NCBI non-redundant protein databases using BLASTx (E-value threshold, 1e−5).
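As a minimal sketch of how such an E-value cutoff might be applied downstream, the Python snippet below filters BLAST hits and keeps one best hit per unigene. It assumes the hits are available in the standard 12-column tabular layout (as written by BLAST+ with `-outfmt 6`), which the text does not specify; the function name `best_hits` and the sample rows are invented for illustration and are not part of the published pipeline.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # threshold stated in the text


def best_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue, bitscore)}.

    Hits with an E-value at or above the cutoff are discarded; among the
    remaining hits, the one with the highest bit score is kept per query.
    Assumes 12 tab-separated columns with the E-value in column 11 and
    the bit score in column 12.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Invented example rows: two hits for Unigene1, one weak hit for Unigene2.
example = "\n".join([
    "Unigene1\tprotA\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-30\t120.0",
    "Unigene1\tprotB\t60.0\t100\t40\t0\t1\t300\t1\t100\t1e-8\t60.0",
    "Unigene2\tprotC\t30.0\t50\t35\t0\t1\t150\t1\t50\t0.5\t25.0",
])
hits = best_hits(example)
```

Here `hits` contains only `Unigene1`, paired with `protA`; `Unigene2` is dropped because its single hit does not pass the cutoff.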
Protein function was predicted from the annotation of the most similar protein in these databases. Blast2GO (Conesa et al. 2005) was used to obtain the GO annotation of the unigenes in three functional ontologies: molecular function, cellular component, and biological process. Each ontology was further divided into a number of sub-functional categories. WEGO (Ye et al. 2006) was then used to classify the unigenes to understand the distribution of gene functions of the species at a macro level. The KEGG pathway annotation was performed using the Blastall software against the Kyoto Encyclopedia of Genes and Genomes database.

Protein Translation

The BLASTx result was used to determine the coding region of each unigene, and the coding region was translated into an amino acid sequence using the standard codon table. BLASTx therefore yielded both the nucleotide sequence (5′–3′) and the amino acid sequence of each unigene coding region. Unigenes that had no match in any of the above databases were examined with ESTScan to predict their coding regions (5′–3′) and translate them into amino acid sequences (Iseli et al. 1999).

Comparison with Sea Anemone Genome

In order to reveal proteins involved in coral skeleton formation, proteins shared by the corals Acropora digitifera and P. carnosus were identified, and those also shared with the sea anemone Nematostella vectensis were filtered out. First, the translated P. carnosus protein sequences were compared with the A. digitifera protein sequences using BLASTp with an E-value threshold of 1e−10 to determine the genes shared by the two corals. To filter out the genes shared by the two corals and the sea anemone, N. vectensis protein sequences were downloaded from the Joint Genome Institute (http://www.jgi.doe.gov/). The common coral genes were then compared with the N. vectensis protein sequences using BLASTp with an E-value threshold of 1e−5. DAVID (Huang et al. 2009) was then applied to find out which groups of “non-sea anemone” coral genes were enriched. The accession numbers of the best hits of these genes from the annotation search against NCBI were submitted as the “non-sea anemone coral gene list”. The accession numbers of the best hits of all proteins from the annotation search against NCBI were submitted as the “background list”. The Gene Ontology option in DAVID was selected to classify the result. The results of the enrichment analysis were further corrected by the Benjamini and Hochberg false discovery rate correction (Huang et al. 2009). Only results with corrected p < 0.05 were considered significant.

Database Structure

A relational database, PcarnBase, was constructed using MySQL (version 5.1.31). PcarnBase is hosted on an Apache HTTP server (version 2.0.63) running on Solaris 10. The search functions are powered by the ViroBLAST tool (Deng et al. 2007) using the PHP programming language. The database consists of seven tables, including five entity tables and two relation tables (Fig. 2). The entity table “NCBI annotation” comprises two attributes: a unique gene ID number (UnigeneID) and a short description of each gene (Description). The entity table “Proteins” includes three attributes: a unique protein ID number (PID), a sequence (Sequence), and a foreign key (UnigeneID) referencing the gene ID in the table “NCBI annotation,” which can be null when there is no relevant annotation. The structure of the entity table “DNAs” is similar to that of the table “Proteins”. The entity table “Gene Ontology” has two attributes: a unique ID number of each gene class (GOclassID) and its relevant ontology (Ontology). The relation table “NCBI_GO_relation” has three attributes: a unique relationship ID (NG_rid), a gene ID (UnigeneID), and a gene class ID (GOclassID), where UnigeneID and GOclassID are foreign keys referencing the entity tables “NCBI annotation” and “Gene Ontology,” respectively. The structures of the entity table “KEGG” and the relation table “NCBI_KEGG_relation” are similar to those of the entity table “Gene Ontology” and the relation table “NCBI_GO_relation.”

Fig. 2 Structure of PcarnBase showing the five entity tables and two relation tables and the definitions of fields in each table. The cardinality constraint (e.g., 1:1, 1:m) shows how many entities of one table can be connected to an entity of another table. Other definitions can be found in “Materials and Methods”
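The seven-table layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module in place of MySQL, the column types are guesses, and the names `DID`, `KEGGID`, `Pathway`, and `NK_rid` are hypothetical placeholders, because the text says only that the “DNAs”, “KEGG”, and “NCBI_KEGG_relation” tables mirror their counterparts.

```python
import sqlite3

# Column names marked "hypothetical" are not given in the text.
SCHEMA = """
CREATE TABLE NCBI_annotation (
    UnigeneID   TEXT PRIMARY KEY,
    Description TEXT
);
CREATE TABLE Proteins (
    PID       INTEGER PRIMARY KEY,
    Sequence  TEXT NOT NULL,
    UnigeneID TEXT REFERENCES NCBI_annotation(UnigeneID)  -- may be NULL
);
CREATE TABLE DNAs (
    DID       INTEGER PRIMARY KEY,                        -- hypothetical name
    Sequence  TEXT NOT NULL,
    UnigeneID TEXT REFERENCES NCBI_annotation(UnigeneID)  -- may be NULL
);
CREATE TABLE Gene_Ontology (
    GOclassID TEXT PRIMARY KEY,
    Ontology  TEXT NOT NULL
);
CREATE TABLE NCBI_GO_relation (
    NG_rid    INTEGER PRIMARY KEY,
    UnigeneID TEXT NOT NULL REFERENCES NCBI_annotation(UnigeneID),
    GOclassID TEXT NOT NULL REFERENCES Gene_Ontology(GOclassID)
);
CREATE TABLE KEGG (
    KEGGID  TEXT PRIMARY KEY,   -- hypothetical name
    Pathway TEXT NOT NULL       -- hypothetical name
);
CREATE TABLE NCBI_KEGG_relation (
    NK_rid    INTEGER PRIMARY KEY,  -- hypothetical name
    UnigeneID TEXT NOT NULL REFERENCES NCBI_annotation(UnigeneID),
    KEGGID    TEXT NOT NULL REFERENCES KEGG(KEGGID)
);
"""


def create_database(path=":memory:"):
    """Create the seven tables and return an open connection."""
    conn = sqlite3.connect(path)
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
# A protein with no relevant annotation is stored with a NULL UnigeneID.
conn.execute(
    "INSERT INTO Proteins (Sequence, UnigeneID) VALUES (?, ?)", ("MKT", None)
)
```

The nullable `UnigeneID` in the sequence tables reflects the statement that a sequence may lack annotation, whereas the relation tables require both keys because a relation row is meaningless without them.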


Results and Discussion

Transcriptome Data

Our transcriptome sequencing produced 83,696,466 raw reads with a read length of 90 base pairs (bp). Among them were 59,550,974 clean reads with a total of 5,359,587,660 nucleotides. Of the clean reads, 95.28 % reached Q20 (a sequencing error rate of no more than 1 %). De novo transcriptome assembly using Trinity and TGICL produced 359,267 contigs and 162,468 unigenes (including 4,076 clusters and 158,392 singletons). Among these, 56,288 (15.7 %) contigs and 63,469 (39.1 %) unigenes were over 500 bp, and 20,431 (5.69 %) contigs and 24,481 (15.1 %) unigenes were over 1,000 bp, indicating the high quality of the assembly (Fig. 3). The overall GC content of the whole transcriptome was 47.93 %. The unigenes ranged from 201 to 23,923 bp in length, with an average of 606 bp and an N50 of 811 bp. A comparison with other species of scleractinian corals with a substantial amount of genomic resources showed that the sequences generated in this study are more numerous and longer than those of any other species without a sequenced genome (i.e., Montastraea faveolata, A. millepora, Pocillopora damicornis) (Table 1). These high-quality data were obtained largely because of the large number of clean reads (59.6 M), the longer reads compared with an earlier version of Illumina sequencing (90 bp vs. 35 bp), and the paired-end sequencing method used.

Fig. 3 Length distribution of contigs and unigenes of the assembled transcriptome of P. carnosus (x-axis: sequence length in bp, binned as 200–500, 500–1000, 1000–2000, 2000–3000, and >3000; y-axis: number of sequences)

Table 1 Summary of DNA sequences in the published genome of A. digitifera and several large transcriptomic datasets of scleractinian corals

Species | Sequencing platform | Reads (M) | Avg. length (nt) | Yield (Mb) | >500 bp | >1 kb | >2 kb | >3 kb | Reference
Montastraea faveolata (a) | Sanger | 0.004 | Approximately 500 | Approximately 2 | 1,576 | 0 | 0 | 0 | Schwarz et al. 2008
Acropora palmata (a) | Sanger | 0.014 | Approximately 500 | 7 | 7,721 | 0 | 0 | 0 | Schwarz et al. 2008
Acropora millepora | 454 GS-FLX | 0.63 | 232 | 146 | 18,229 | 6,908 | 857 | 104 | Meyer et al. 2009
Acropora digitifera (b) | 454 GS-FLX and Illumina GAIIx | | | | 15,317 | 8,691 | 3,515 | 1,579 | Shinzato et al. 2011
Pocillopora damicornis | 454 GS-FLX | 0.96 | 379 | 362 | 56,263 | 19,991 | 1,401 | 59 | Traylor-Knowles et al. 2011
Platygyra carnosus | Illumina HiSeq 2000 | 59.6 | 90 | 5,360 | 63,470 | 24,482 | 4,946 | 1,281 | This study

(a) Data for M. faveolata and A. palmata were downloaded from the NCBI EST database, and the analysis was based on the raw EST sequences
(b) Transcripts deduced from the whole genome of A. digitifera were downloaded from the OIST Marine Genomics Unit (http://marinegenomics.oist.jp/genomes/download?project_id=3) and sorted according to sequence length to generate the data

A BLASTx search of the 162,468 unigenes against the NCBI non-redundant protein database resulted in 55,355 matches with an E-value threshold of 1e−5, and these were translated to proteins. ESTScan translated an additional 37,439 unigenes into proteins. To compare with the proteins of A. digitifera, the only species of scleractinian coral with a sequenced genome (Shinzato et al. 2011), a BLASTx search was also conducted against the 23,668 protein sequences deduced from its genome. A total of 47,732 (29 %) P. carnosus unigenes matched 14,419 (61 %) A. digitifera protein sequences. Among the P. carnosus unigenes with homologs in the NCBI non-redundant protein database, 59 % (32,536 unigenes), 13 % (7,317 unigenes), and 7 % (3,888 unigenes) had top hits to protein sequences from Metazoa, Alveolata (to which the symbiont Symbiodinium belongs), and Bacteria, respectively (Fig. 4). Among the hits matched to Metazoa, Cnidaria accounted for 50 % (16,391 unigenes) and Chordata accounted for 26 % (8,587 unigenes) (Fig. 4). Among the hits matched to Cnidaria, N. vectensis and Hydra magnipapillata represented 91 and 6 %, with 14,991 and 1,086 hits, respectively. A search of the NCBI protein database on 8 May 2012 revealed only 635 sequences from Symbiodinium, which indicates an under-representation of sequences from these coral-symbiotic algae. Our BLASTx search of P. carnosus unigenes detected 157 matches to a sequence from Symbiodinium, which was similar to the result from searching a recently assembled transcriptome of P. damicornis (142 matches) (Traylor-Knowles et al. 2011).

Fig. 4 Taxonomic distribution of P. carnosus unigene BLASTx hits against NCBI’s nr protein sequence database. The relative abundance of hits in main taxonomic groups is represented as a percentage of the total number of hits

A GO analysis showed that the highest percentage of P. carnosus sequences belonged to biological process (45 %), followed by cellular component (36 %) and molecular function (19 %) (Fig. 5). A BLAST search against the KEGG database revealed that 39,459 unigenes matched at least one sequence, and these unigenes were further grouped into 242 pathways (ESM 2 of the “Electronic Supplementary Material”).

Fig. 5 GO functional annotation and abundance of P. carnosus unigenes, which are classified into three major categories (ontologies): biological process, molecular function, and cellular component. Each of these three categories is further divided into several functional subcategories (axes: percent of unigenes and number of unigenes for each subcategory)

Multiple BLAST comparisons showed that 5,656 unigenes were specific to the two species of corals, i.e., they had no significant (E-value less than 1e−5) BLAST hit to the sea anemone N. vectensis sequences. Among them, 2,585 unigenes could be annotated as mentioned earlier, so the rest could be interpreted as “taxonomically restricted genes” which may exhibit a positive selection signature (Voolstra et al. 2011). An analysis using DAVID showed that several categories, especially nucleotide binding and nucleoside binding, were enriched in molecular function, and two categories, cell adhesion and biological adhesion, were enriched in biological process (Table 2). Apextrin, which has been reported to play a key role in the metamorphosis of A. millepora (Grasso et al. 2008), was found in our non-sea anemone coral gene list. Two NACHT-domain-containing genes and three NB-ARC-domain-containing genes, which were considered to be primary intracellular pattern receptors in innate immunity and were restricted to corals, were also found in our non-sea anemone coral gene list. These NACHT/NB-ARC domain genes may in part reflect adaptations associated with the symbiont (Shinzato et al. 2011). In general, this list of non-anemone coral genes may be associated with the establishment of a symbiotic relationship between Symbiodinium and corals and could be targets of coral bleaching studies. Overall, the enrichment of cell or biological adhesion-related unigenes is consistent with the suggestion of the presence of “coral-specific” processes such as calcareous skeleton formation (Grasso et al. 2008). The full list of non-anemone coral unigenes and their annotation is shown in ESM 3 of the “Electronic Supplementary Material”.

Table 2 Enriched gene categories deduced from DAVID. Only those with a P-value less than 0.05 are considered to be enriched

Category | Term (GO number) | Count (%) | P-value (a)
BP | Cell adhesion (GO:0007155) | 15 (1.7) | 0.010
BP | Biological adhesion (GO:0022610) | 15 (1.7) | 0.010
MF | NAD+ ADP-ribosyltransferase activity (GO:0003950) | 7 (0.8) | 5.2e−5
MF | Transferase activity, transferring pentosyl groups (GO:0016763) | 7 (0.8) | 3.0e−4
MF | Nucleotide binding (GO:0000166) | 35 (4.1) | 3.0e−3
MF | Purine nucleotide binding (GO:0017076) | 33 (3.8) | 3.3e−3
MF | Purine ribonucleotide binding (GO:0032555) | 31 (3.6) | 3.6e−3
MF | Ribonucleotide binding (GO:0032553) | 31 (3.6) | 3.6e−3
MF | Adenyl nucleotide binding (GO:0030554) | 28 (3.2) | 4.1e−3
MF | Purine nucleoside binding (GO:0001883) | 28 (3.2) | 4.1e−3
MF | Nucleoside binding (GO:0001882) | 28 (3.2) | 3.9e−3
MF | ATP binding (GO:0005524) | 26 (3.0) | 6.1e−3
MF | Adenyl ribonucleotide binding (GO:0032559) | 26 (3.0) | 5.6e−3

(a) Significance level was corrected by Benjamini and Hochberg false discovery rate correction
BP biological process, MF molecular function

Database Utility

PcarnBase can be accessed freely via a web interface at http://www.comp.hkbu.edu.hk/~db/PcarnBase/index.php. The database is searchable by BLAST and by a number of query terms. The Pcarn BLAST supports both basic and advanced search interfaces. The basic search allows users to blast query sequences against the local P. carnosus DNA/protein database, while the advanced search enables users to customize the blast options. Upon submitting a blast search, the user is presented with the gene or protein sequences that match the query sequence, along with their E-values and scores. If the returned proteins or DNAs contain the attribute “UnigeneID,” the corresponding annotations are displayed.

The General Annotation Search allows one to query the relevant annotation, i.e., NCBI annotation, GO, and KEGG, using a unigene name. Each successful query returns a table that contains the unigene’s NCBI annotation, GO class IDs (if available), and KEGG pathways (if available). Both the GO class IDs and the KEGG pathways link to the unigenes in the GO or KEGG table, respectively. The GO Annotation Search allows one to query the gene ontology using a GO class ID (e.g., “developmental process”) or a GO Ontology (e.g., “biological_process”). Each successful query returns a table including the GO class ID, the GO Ontology, and the unigenes matched. The KEGG Annotation Search allows one to query the KEGG information (e.g., “MAPK signaling pathway”). Each successful query returns a table including the KEGG pathway, the relevant path ID, and the unigenes matched. Each KEGG pathway page includes a map and several frames. The matched unigenes for each frame are shown when the mouse pointer is placed over the frame.

As a sample output of the KEGG annotation search, we present the human p38 mitogen-activated protein kinase (MAPK) signaling pathway (ESM 4 of the “Electronic Supplementary Material”). This pathway plays a pivotal role in regulating early stress-induced transcription in response to distinct stress stimuli, e.g., UV irradiation, heat shock, and osmotic shock, and is also related to cell differentiation and apoptosis (Whitmarsh 2010). Several MAPK genes in the corals P. damicornis and Seriatopora hystrix have been reported to respond to osmotic stress (Mayfield et al. 2010). Our finding that most of the human MAPK genes are present in the P. carnosus transcriptome (in gray boxes) indicates that the coverage of our transcriptome dataset is high and provides strong support for the notion that this pathway is conserved. This pathway can serve as a future candidate for understanding how corals respond to environmental stimuli.
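The behavior of the General Annotation Search described above can be pictured with a small Python sketch. It is not the PHP/MySQL code behind PcarnBase: the tables are replaced by in-memory toy data, every identifier and description in them is invented, and the names `AnnotationResult` and `general_annotation_search` are hypothetical. The point is only that GO class IDs and KEGG pathways are optional, so a unigene with an NCBI annotation but no GO or KEGG entries still returns a result.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class AnnotationResult:
    unigene_id: str
    description: str
    go_class_ids: List[str] = field(default_factory=list)
    kegg_pathways: List[str] = field(default_factory=list)


# Toy stand-ins for the annotation and relation tables (invented content).
NCBI_ANNOTATION: Dict[str, str] = {
    "Unigene1": "example description (invented)",
    "Unigene2": "another example description (invented)",
}
NCBI_GO_RELATION: List[Tuple[str, str]] = [
    ("Unigene1", "GO:0007155"),
    ("Unigene1", "GO:0022610"),
]
NCBI_KEGG_RELATION: List[Tuple[str, str]] = [
    ("Unigene1", "MAPK signaling pathway"),
]


def general_annotation_search(unigene_id: str) -> Optional[AnnotationResult]:
    """Look up one unigene; return None when it has no NCBI annotation."""
    description = NCBI_ANNOTATION.get(unigene_id)
    if description is None:
        return None
    return AnnotationResult(
        unigene_id=unigene_id,
        description=description,
        # Empty lists stand for "not available".
        go_class_ids=[go for uid, go in NCBI_GO_RELATION if uid == unigene_id],
        kegg_pathways=[kp for uid, kp in NCBI_KEGG_RELATION if uid == unigene_id],
    )


full = general_annotation_search("Unigene1")
partial = general_annotation_search("Unigene2")
missing = general_annotation_search("Unigene999")
```

In this toy data, `full` carries two GO class IDs and one KEGG pathway, `partial` carries only a description, and `missing` is `None`.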


Conclusions

We have generated a large set of transcriptome data for P. carnosus and constructed a database for these sequences. This project has also greatly enhanced the genomic resources for a clade of scleractinian corals (i.e., faviids + meandriinids; Fig. 1 in Traylor-Knowles et al. 2011) to which P. carnosus belongs. The database can support various basic and applied molecular studies. For example, studies can be conducted to examine the responses of the coral and its symbiotic dinoflagellate to various environmental stressors, changes in gene expression during disease development, and how different genotypes respond differently to certain environmental conditions. To examine the involvement of a set of selected genes in stress resistance, microarray platforms can be developed using transcript-specific oligonucleotide probes (Forêt et al. 2007). In addition, a high-throughput tag sequencing approach, such as digital gene expression, can be adopted to compare the gene expression of different treatments at the whole-genome level, without selecting target genes or metabolic pathways before the experiment (Morrissy et al. 2009; Wang et al. 2010). Such studies will provide scientific data to support management decisions for better protection of this regionally important species of scleractinian coral and other species of marine life associated with it. The database can also be used for comparative research on the mechanisms of stress responses in different species of corals. When more research results on this species become available, we will update the database with more sequences and better sequence annotation. Accumulation of such transcriptomic data is a first step towards deciphering the P. carnosus genome.

Acknowledgments This paper is the result of a collaboration between the Agriculture, Fisheries and Conservation Department, Hong Kong SAR Government, and the Department of Biology, Hong Kong Baptist University (HKBU). A grant to JX from HKBU supported the database construction. Transcriptome sequencing was conducted by Beijing Genomics Institute, Shenzhen, China.

References Bay LK, Ulstrup KE, Nielsen HB, Jarmer H, Goffard N, Willis BL, Miller DJ, Van Oppen MJ (2009) Microarray analysis reveals transcriptional plasticity in the reef building coral Acropora millepora. Mol Ecol 18:3062–3075 Bhagooli R, Hidaka M (2004) Photoinhibition, bleaching susceptibility and mortality in two scleractinian corals, Platygyra ryukyuensis and Stylophora pistillata, in response to thermal and light stresses. Comp Biochem Physiol A Mol Integr Physiol 137:547–555 Bryant D, Burke L, McManus J, Spalding M (1998) Reefs at risk. A map-based indicator of threats to the world’s coral reefs. World Resources Institute, Washington Burke L, Selig E, Spalding M (2002) Reefs at risk in Southeast Asia. World Resources Institute, Washington

Chen TY, Yu KF, Shi Q, Li S, Wang R, Zhao MX (2007) Distribution and status of scleractinian coral communities in Daya Bay, Guangdong. Trop Geogr 27:493–498
Chiu JM, Li S, Li A, Po B, Zhang R, Shin PK, Qiu JW (2012) Bacteria associated with skeletal tissue growth anomalies in the coral Platygyra carnosus. FEMS Microbiol Ecol 79:380–391
Coles SL (1994) Extensive disease outbreak at Fahl Island, Gulf of Oman, Indian Ocean. Coral Reefs 13:242
Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M (2005) Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21:3674–3676
BCL (Binnie Consultants Limited) (1995) 1994 Hypoxia and mass mortality event in Mirs Bay. Final report to the Geotechnical Engineering Office, Civil Engineering Department, Hong Kong Government
Deng W, Nickle DC, Learn GH, Maust B, Mullins JI (2007) ViroBLAST: a stand-alone BLAST web server for flexible queries of multiple databases and user's datasets. Bioinformatics 23:2334–2336
DeSalvo MK, Sunagawa S, Fisher PL, Voolstra CR, Iglesias-Prieto R, Medina M (2010) Coral host transcriptomic states are correlated with Symbiodinium genotypes. Mol Ecol 19:1174–1186
Edge SE, Morgan MB, Gleason DF, Snell TW (2005) Development of a coral cDNA array to examine gene expression profiles in Montastraea faveolata exposed to environmental stress. Mar Pollut Bull 51:507–523
Forêt S, Kassahn KS, Grasso LC, Hayward DC, Iguchi A, Ball EE, Miller DJ (2007) Genomic and microarray approaches to coral reef conservation biology. Coral Reefs 26:475–486
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29:644–652
Grasso LC, Maindonald J, Rudd S, Hayward DC, Saint R, Miller DJ, Ball EE (2008) Microarray analysis identifies candidate genes for key roles in coral development. BMC Genomics 9:540
Hoegh-Guldberg O, Mumby PJ, Hooten AJ, Steneck RS, Greenfield P, Gomez E, Harvell CD, Sale PF, Edwards AJ, Caldeira K, Knowlton N, Eakin CM, Iglesias-Prieto R, Muthiga N, Bradbury RH, Dubi A, Hatziolos ME (2007) Coral reefs under rapid climate change and ocean acidification. Science 318:1737–1742
Huang DW, Sherman BT, Lempicki RA (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4:44–57
Hughes TP, Baird AH, Bellwood DR, Card M, Connolly SR, Folke C, Grosberg R, Hoegh-Guldberg O, Jackson JB, Kleypas J, Lough JM, Marshall P, Nyström M, Palumbi SR, Pandolfi JM, Rosen B, Roughgarden J (2003) Climate change, human impacts, and the resilience of coral reefs. Science 301:929–933
Iseli C, Jongeneel CV, Bucher P (1999) ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. Proc Int Conf Intell Syst Mol Biol 138–148
Lam K, Shin PKS, Hodgson P (2007) Severe bioerosion caused by an outbreak of corallivorous Drupella and Diadema at Hoi Ha Wan Marine Park, Hong Kong. Coral Reefs 26:893
Levy O, Kaniewska P, Alon S, Eisenberg E, Karako-Lampert S, Bay LK, Reef R, Rodriguez-Lanetty M, Miller DJ, Hoegh-Guldberg O (2011) Complex diel cycles of gene expression in coral–algal symbiosis. Science 331:175
Littler MM, Littler DS (1996) Black band disease in the South Pacific. Coral Reefs 15:20
Loya Y, Bull G, Pichon M (1984) Tumor formations in scleractinian corals. Helgoland Mar Res 37:99–112
Mayfield AB, Hsiao YY, Fan TY, Chen CS, Gates RD (2010) Evaluating the temporal stability of stress-activated protein kinase and cytoskeleton gene expression in the Pacific reef corals Pocillopora damicornis and Seriatopora hystrix. J Exp Mar Biol Ecol 395:215–222
Meyer E, Aglyamova GV, Wang S, Buchanan-Carter J, Abrego D, Colbourne JK, Willis BL, Matz MV (2009) Sequencing and de novo analysis of a coral larval transcriptome using 454 GSFlx. BMC Genomics 10:219
Morrissy AS, Morin RD, Delaney A, Zeng T, McDonald H, Jones S, Zhao Y, Hirst M, Marra MA (2009) Next-generation tag sequencing for cancer gene expression profiling. Genome Res 19:1825–1835
Nybakken JW, Bertness MD (2005) Marine biology: an ecological approach, 6th edn. Benjamin Cummings, San Francisco
Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, Tsai J, Quackenbush J (2003) TIGR Gene Indices Clustering Tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics 19:651–652
Portune KJ, Voolstra CR, Medina M, Szmant AM (2010) Development and heat stress induced transcriptomic changes during embryogenesis of the scleractinian coral Acropora palmata. Mar Genom 3:51–62
Schwarz JA, Brokstein PB, Voolstra CR, Terry AY and others (2008) Coral life history and symbiosis: functional genomic resources for two reef building Caribbean corals, Acropora palmata and Montastraea faveolata. BMC Genomics 9:97
Shi Q, Zhao MX, Zhang QM, Wang HK, Wang LR (2007) Growth variations of scleractinian corals at Luhuitou, Sanya, Hainan Island, and the impacts from human activities. Acta Ecologica Sin 27:3316–3323
Shinzato C, Shoguchi E, Kawashima T, Hamada M, Hisata K, Tanaka M, Fujie M, Fujiwara M, Koyanagi R, Ikuta T, Fujiyama A, Miller DJ, Satoh N (2011) Using the Acropora digitifera genome to understand coral responses to environmental change. Nature 476:320–323
Sutherland KP, Porter JW, Torres C (2004) Disease and immunity in Caribbean and Indo-Pacific zooxanthellate corals. Mar Ecol Prog Ser 266:273–302
Thinesh T, Mathews G, Edward JKP (2011) Coral disease prevalence in the Palk Bay, Southeastern India – with special emphasis to black band. Indian J Mar Sci 40:813–820
Traylor-Knowles N, Granger BR, Lubinski TJ, Parikh JR, Garamszegi S, Xia Y, Marto JA, Kaufman L, Finnerty JR (2011) Production of a reference transcriptome and transcriptomic database (PocilloporaBase) for the cauliflower coral, Pocillopora damicornis. BMC Genomics 12:585
Vargas-Ángel B (2009) Coral health and disease assessment in the U.S. Pacific remote island areas. B Mar Sci 84:211–227
Veron J (2000) Corals of the world. Australian Institute of Marine Science, Townsville
Voolstra CR, Sunagawa S, Matz MV, Bayer T, Aranda M, Buschiazzo E, Desalvo MK, Lindquist E, Szmant AM, Coffroth MA, Medina M (2011) Rapid evolution of coral proteins responsible for interaction with the environment. PLoS ONE 6:e20392
Wang XW, Luan JB, Li JM, Bao YY, Zhang CX, Liu SS (2010) De novo characterization of a whitefly transcriptome and analysis of its gene expression during development. BMC Genomics 11:400
Whitmarsh SJ (2010) A central role for p38 MAPK in the early transcriptional response to stress. BMC Biol 8:47
Wilkinson C (ed) (2004) Status of the coral reefs of the world: 2004. Global Coral Reef Monitoring Network and Australian Institute of Marine Science
Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, Wang J, Li S, Li R, Bolund L, Wang J (2006) WEGO: a web tool for plotting GO annotations. Nucleic Acids Res 34:W293–W297
Zhao MX, Yu KF, Zhang QM, Shi Q (2010) Long term change in coral cover in Luhuitou fringing reef, Sanya. Oceanologia Limnologia Sin 41:440–447
