Appl Microbiol Biotechnol (2011) 90:1739–1754 DOI 10.1007/s00253-011-3268-5
GENOMICS, TRANSCRIPTOMICS, PROTEOMICS
Profiling of biodegradation and bacterial 16S rRNA genes in diverse contaminated ecosystems using 60-mer oligonucleotide microarray
Ashutosh Pathak & Rishi Shanker & Satyendra Kumar Garg & Natesan Manickam
Received: 19 February 2011 / Revised: 16 March 2011 / Accepted: 16 March 2011 / Published online: 19 April 2011 # Springer-Verlag 2011
Abstract We have developed an oligonucleotide microarray for the detection of biodegradative genes and bacterial diversity and tested it in five contaminated ecosystems. The array, which we named BiodegPhyloChip, carries 60-mer oligonucleotide probes comprising 14,327 unique probes derived from 1,057 biodegradative genes and 880 probes representing 110 phylogenetic genes from diverse bacterial communities. The biodegradative genes are involved in the transformation of 133 chemical pollutants. The microarray was validated for sensitivity, specificity, and quantitation using DNA isolated from well-characterized mixed bacterial cultures that also contained non-target strains, pure degrader strains, and environmental DNA. Application of the developed array using DNA extracted from five different contaminated sites led to the detection of 186 genes, including 26 genes unique to the individual sites. Hybridization of 16S rRNA probes revealed the presence of bacteria similar to well-characterized genera
involved in biodegradation of various pollutants. Genes involved in complete degradation pathways for hexachlorocyclohexane (lin), 1,2,4-trichlorobenzene (tcb), naphthalene (nah), phenol (mph), biphenyl (bph), benzene (ben), toluene (tbm), xylene (xyl), phthalate (pht), salicylate (sal), and resistance to mercury (mer) were detected with the highest intensity. The most abundant genes encoded hydroxylases, monooxygenases, and dehydrogenases, which were present in all five samples. Thus, the array developed and validated here should be useful for simultaneously assessing both the biodegradative potential and the composition of environmentally useful bacteria in hazardous ecosystems.
Keywords BiodegPhyloChip . 60-mer oligoarray . Biodegradation genes . 16S rRNA genes
Electronic supplementary material The online version of this article (doi:10.1007/s00253-011-3268-5) contains supplementary material, which is available to authorized users.
A. Pathak : N. Manickam (*) Environmental Biotechnology Division, Indian Institute of Toxicology Research (CSIR), P.O. Box 80, Mahatma Gandhi Marg, Lucknow 226 001, Uttar Pradesh, India e-mail: [email protected]
R. Shanker Environmental Microbiology Division, Indian Institute of Toxicology Research (CSIR), Lucknow, India
S. K. Garg Centre of Excellence in Microbiology, Dr. R.M.L. Avadh University, Faizabad, India
Introduction
Microarray technology has the great advantage of detecting a large number of genes, and their expression, simultaneously in environmental samples. Microarrays have nevertheless been adopted more slowly for environmental analyses than in other areas of biological research. The technology has been improved substantially through studies on the length of the oligonucleotide probe (He et al. 2005; Liebich et al. 2006), sensitivity (Kane et al. 2000; Letowski et al. 2004; Liu et al. 2007; Deng et al. 2008), and various array chemistries (Kumar et al. 2000; Mahajan et al. 2006). Microarrays were developed primarily to profile gene expression, especially for genome-wide analysis in specific organisms (Schena et al. 1995; DeRisi et al. 1997; Bowtell 1999; Duggan et al. 1999; Lipshutz et al.
1999). More recently, however, they have found major applications in environmental studies (Liu et al. 2003; Gao et al. 2004, 2007; Gentry et al. 2006) aimed at understanding community structure and functional gene activities. Arrays have been successfully employed to monitor environmental processes such as biodegradation (Loy et al. 2002; Taroncher-Oldenburg et al. 2003; Zhou 2003; Bodrossy and Sessitsch 2004; Rhee et al. 2004; Steward et al. 2004; Tiquia et al. 2004; Zhou et al. 2004; Wu et al. 2006; He et al. 2007), environmental diagnosis, and community compositions (Stralis-Pavese et al. 2004; Gescher et al. 2008). Microarrays have also been used to understand the microbial diversity in extreme environments (Bond et al. 1995; Schut et al. 2001; Stralis-Pavese et al. 2004; Brodie et al. 2006; He et al. 2007; Garrido et al. 2008; Gebert et al. 2008) and to profile the global expression of genes in pure cultures or specific groups of microorganisms and ecosystems (Gentry et al. 2006). To examine the presence of known genes involved in biodegradation and metal resistance, 50-mer oligonucleotide microarrays for functional genes have been reported (Rhee et al. 2004; He et al. 2007). Compared to PCR-based probe microarrays, oligonucleotide probe microarrays have shown an advantage in terms of probe design and independence from culturability of the targets (Relógio et al. 2002; He et al. 2005). A high-density array containing 24,243 oligonucleotide probes covering about 10,000 genes was also designed and evaluated in extreme environments (He et al. 2007). Microarrays based on 16S rRNA probes have also been successfully employed to understand bacterial diversity and composition (Liu et al. 2001; Koizumi et al. 2002; Wilson et al. 2002; Bonch-Osmolovskaya et al. 2003; Dar et al. 2007) and for community structure analysis (Cho and Tiedje 2001; Jaccoud et al. 2001; Murray et al. 2001; Langenheder et al. 2006).
Here, we describe the development and application of a functional and 16S rRNA gene microarray comprising 60-mer oligonucleotide probes. This hybrid array, named “BiodegPhyloChip”, enabled us to simultaneously study the functional/biodegradative capabilities and bacterial diversity of various hazardous environments. We validated the designed microarray using DNA from well-characterized pure culture strains and environmental DNA. The results demonstrate the successful use of the designed microarray to understand the genetic capabilities and microbial communities in real hazardous environmental samples.
Materials and methods
Sequence collection and microarray construction
The microarray developed contains a total of 1,167 genes, of which 1,057 are involved in biodegradation
pathways of various groups of chemicals and their metabolites. It also contains 110 genes representing the 16S rRNA genes of bacteria involved in biodegradation. The list of genes was obtained from public domain databases such as the University of Minnesota Biocatalysis and Biodegradation Database (http://umbbd.msi.umn.edu/) and The Institute of Genomic Research (TIGR) (http://cmr.tigr.org). The genes, with their identification numbers (gi) and complete gene sequences, were downloaded from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nih.nlm.edu) and TIGR. A total of 12 probes (six sense and six antisense) of 60-mer length were designed for each gene using the eArray software (http://earray.chem.agilent.com/earray/) from Agilent Incorporation, Santa Clara, CA, USA. Each individual gene sequence was compared against the others in the sequence database using the NCBI basic local alignment search tool (BLAST), and sequences showing 85% similarity were aligned. Based on the results of the alignment, the 60-mer segments showing less than 85% nucleotide identity to the corresponding aligned segments of any BLAST hit sequence were selected as potential probes. The probes were designed to have an average melting temperature of 80 °C with a 15–20 °C variation. The probes were verified by BLAST and were found to be gene specific. Probes with self-hybridization potential or cross-hybridization potential for non-target sequences were removed in order to achieve maximum gene specificity. The selected probes had an alignment length greater than 30 bp and 85% identity. After annotation with the sequence information, the probe details were uploaded to the eArray software for in situ synthesis of the microarray.
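The window-selection step described above can be illustrated with a minimal sketch. It is not the eArray or BLAST procedure used by the authors; it assumes, for illustration only, that a target gene and one BLAST hit are already available as equal-length, ungapped aligned strings. The function names (`windows`, `identity`, `candidate_probes`, `reverse_complement`) are hypothetical.

```python
from typing import Iterator, List, Tuple

PROBE_LEN = 60        # probe length stated in the text
MAX_IDENTITY = 0.85   # windows below this identity to a hit are kept

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def windows(seq: str, size: int = PROBE_LEN) -> Iterator[Tuple[int, str]]:
    """Yield (start, window) for every window of the given size."""
    seq = seq.upper()
    for start in range(len(seq) - size + 1):
        yield start, seq[start:start + size]


def identity(a: str, b: str) -> float:
    """Fraction of matching positions in two equal-length segments."""
    if len(a) != len(b):
        raise ValueError("segments must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def candidate_probes(target: str, aligned_hit: str) -> List[Tuple[int, str]]:
    """Keep target windows with < 85% identity to the aligned hit.

    Assumption: target and aligned_hit are position-matched (no gaps).
    """
    kept = []
    for (start, win), (_, hit_win) in zip(windows(target), windows(aligned_hit)):
        if identity(win, hit_win) < MAX_IDENTITY:
            kept.append((start, win))
    return kept
```

In practice a gene has many BLAST hits, so a window would be kept only if it passes this filter against every hit; the antisense probes correspond to `reverse_complement` of the retained sense windows. Melting temperature and self-hybridization checks are deliberately left out of this sketch.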
Site description and sampling
Soil and sediments were collected from the vicinity of five contaminated habitats distributed at different geographical locations of India: (1) the chloroaromatic chemicals and solvents manufacturing industry known as India Pesticide Limited, Chinhat, Lucknow (26°51′0″ north, 80°55′0″ east); (2) Gomti river sediment, Lucknow (26°51′30″ north, 80°56′14″ east); (3) heavy metal industry dump sites, Kanpur; (4) central effluent treatment plant (CETP) along the Ganges River near Kanpur (26°29′35″ north, 80°18′–80°25′ east); and (5) the Mathura oil refineries, Mathura (27°27′0″ north, 77°43′12″ east). At each sampling location, multiple samples were collected from areas close to (100 m) and far from (500 m) the industry, along the effluent channel, and at the place where the effluent channel from the industrial area falls into the river. The soil samples, collected in airtight vessels, were transported on ice (4 °C) to the laboratory and processed immediately for DNA isolation.
Bacterial strains and growth conditions
The Sphingomonas sp. strain NM05 (MTCC 8061) was reported (Manickam et al. 2008) for the biodegradation of γ-hexachlorocyclohexane (γ-HCH) and was maintained in minimal medium (KH2PO4, 170 mg; Na2HPO4, 980 mg; (NH4)2SO4, 100 mg; MgSO4·7H2O, 4.87 mg; FeSO4·7H2O, 0.05 mg; CaCO3, 0.20 mg; ZnSO4·7H2O, 0.08 mg; CuSO4·5H2O, 0.016 mg; H3BO3, 0.006 mg; dissolved in distilled water and volume made up to 100 ml, pH 7.6) containing 100 μg/ml γ-HCH. All the lin genes involved in biodegradation of γ-HCH from the strain NM05 have been characterized (Manickam et al. 2008). The strain Rhodococcus sp. RHA1 (Masai et al. 1995) was a gift from Dr. Stefan R. Kaschabek (Institute of Biosciences, TU Bergakademie Freiberg, Germany) and is a known degrader of a wide range of polychlorinated biphenyls, substituted phenols, benzoates, and phthalates (McLeod et al. 2006). The strain was grown and harvested separately in minimal medium containing 0.2% biphenyl and 20 mM benzoate as the sole source of carbon and energy. In addition, a Bordetella sp. strain IITR02 (MTCC 10496) was isolated and characterized (GenBank accession no. EU752498) in our laboratory for its ability to grow on different tri- and di-chlorobenzenes. It was harvested after growth on minimal medium with 3.2 mM 1,2,4-trichlorobenzene as the sole source of carbon and energy. The strains Escherichia coli DH5α and BL21 were obtained from New England Biolabs (Bethesda, USA) and were grown in Luria-Bertani (LB) medium (peptone, 1.0 g; yeast extract, 0.5 g; NaCl, 0.5 g in 100 ml; pH, 7.0) at 37 °C overnight. An E. coli K12 (MTCC 1261) harboring plasmids of 140, 60, 34, and 6.5 kb was obtained from the Microbial Type Culture Collection, Institute of Microbial Technology, Chandigarh, India. All three E. coli strains are host bacteria not reported to have biodegradative properties and were used in this study as non-target strains. For the DNA extraction, all the E. 
coli strains were grown in LB medium at 37 °C with shaking at 150 rpm overnight.
DNA extraction
The total DNA to be used as target for the microarray hybridization was isolated from the soil/sediment samples using the Mo Bio Power soil DNA kit (MO Bio Inc., Carlsbad, CA USA) as per the manufacturer’s instructions. Before processing for DNA isolation, the soil/sediments were mixed thoroughly using a mortar and pestle. One gram of soil/sediment was taken for DNA isolation, and during processing with the kit, lysozyme was added to facilitate the lysis of the bacteria. The genomic DNA from pure bacterial cultures was isolated using the Wizard genomic DNA isolation kit (Promega Corporation, Madison, WI, USA) as per the manufacturer’s instructions. The concentration of the DNA was measured with a Nanodrop spectrophotometer (Nanodrop Technologies Inc, Rockland, USA), and the quality was determined with the DNA 12000 kit (Caliper Sciences, USA) on the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA).
Labeling of DNA extracted from soil
The bacterial DNA samples were labeled as per Agilent’s comparative genomic hybridization protocol. Briefly, 0.5 μg of each bacterial genomic DNA was digested using the AluI and RsaI enzymes at 37 °C for 2 h. To the digested DNA, 2.5 μL of random primers was added, and the mixture was incubated at 95 °C for 3 min and transferred to ice for 5 min. The DNA was labeled using the Agilent Genomic DNA Labeling Kit PLUS (Agilent Technologies, Santa Clara, CA, USA). Briefly, 10 μL of Labeling Master Mix containing Klenow enzyme and 1.5 μL of Cyanine-3-dUTP dye were added to each reaction tube containing the digested DNA and incubated at 37 °C for 2 h. The enzyme was inactivated by heating to 65 °C for 10 min. The labeled gDNA was cleaned up using Microcon YM-30 filters (Millipore, MA, USA). The samples were diluted with 430 μL of 1X TE (pH 8.0), applied to the column, and centrifuged at 10,000 rpm for 10 min. Another wash with 480 μL of 1X TE (pH 8.0) was done, and the DNA was eluted by inversion of the column and centrifugation at 8,000 rpm for 1 min. The DNA was concentrated to a final volume of 18 μL. The concentration of the labeled genomic DNA was measured using the NanoDrop 1000 spectrophotometer.
Microarray hybridization
The labeled genomic DNA samples (16 μL) were hybridized to the developed multibacterial microarray using the Agilent Oligo aCGH Hybridization Kit (Agilent Technologies, Santa Clara, CA, USA). The labeled samples were mixed with 4.5 μL of blocking buffer, 2 μL of ssDNA, and 22.5 μL of hybridization buffer. 
Hybridization was carried out for 24 h at 65 °C in a rotatory hybridization oven (Agilent Technologies, Santa Clara, CA, USA). The hybridized slides were washed in Agilent CGH wash buffer 1 (part no. 5188–5226, Agilent Technologies, Santa Clara, CA, USA) for 1 min at ambient temperature followed by a wash with gene expression wash buffer 2 (part no. 5188–5226, Agilent Technologies, Santa Clara, CA, USA) at 37 °C for 5 min. The labeled sample of India Pesticide Limited (IPL) was hybridized in duplicate to the array to serve as the technical replicates.
Hybridization for specificity determination using mixed culture
In order to determine the target–probe specificity of the microarray, we mixed well-characterized organisms reported for biodegradation with bacterial hosts. The three degrader strains were Sphingomonas sp. NM05, Rhodococcus sp. RHA1, and Bordetella sp. IITR02, known to biodegrade hexachlorocyclohexane, biphenyl, and 1,2,4-trichlorobenzene, respectively. E. coli K12, DH5α, and BL21 (non-targets) were host cells and were mixed in defined ratios as described below. The first mixture contained 10⁷ cells per milliliter of each culture, with all six cultures mixed in equal ratios. In the second mixture, the cultures involved in biodegradation were mixed with non-target cultures in a 2:1 ratio. It has been shown by Trevors et al. (2010) that 10⁷ bacterial cells yield approximately 500 ng of genomic DNA, and hence we prepared the third mixture from 500 ng of genomic DNA of each of the six cultures. The fourth mixture contained cells of all three degraders in equal ratios (1:1:1) without the non-target cells. The fifth mixture included only the three non-degraders in equal ratios (1:1:1). The sixth mixture contained 2×10⁷ cells of NM05 and 1×10⁷ cells each of the IITR02 and RHA1 strains in the absence of non-target strains. In the seventh combination, 2×10⁷ cells of strain IITR02 were mixed with 1×10⁷ cells each of strains NM05 and RHA1, without the non-degraders, and the eighth mixture contained 2×10⁷ cells of RHA1 with 1×10⁷ cells each of NM05 and IITR02 without the non-target strains. The genomic DNA from all the above mixtures was isolated using the Wizard genomic DNA isolation kit (Promega Corporation, Madison, WI, USA) as per the manufacturer’s instructions. 
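The eight combinations above can be written down as a small lookup table, which makes the degrader-to-non-target proportions easy to check. This is only an illustrative sketch: amounts are relative units (1 unit = 10⁷ cells, or 500 ng of genomic DNA for mixture 3), the 2:1 ratio of mixture 2 is assumed here to apply per strain, and the names `MIXTURES` and `degrader_fraction` are hypothetical.

```python
from typing import Dict

DEGRADERS = ("NM05", "RHA1", "IITR02")
NON_TARGETS = ("K12", "DH5alpha", "BL21")

# Relative amount of each strain per mixture (1 unit = 1e7 cells;
# mixture 3 uses 500 ng of extracted genomic DNA per strain instead).
MIXTURES: Dict[int, Dict[str, int]] = {
    1: {s: 1 for s in DEGRADERS + NON_TARGETS},
    # Assumption: "2:1" is read as 2 units per degrader, 1 per non-target.
    2: {**{s: 2 for s in DEGRADERS}, **{s: 1 for s in NON_TARGETS}},
    3: {s: 1 for s in DEGRADERS + NON_TARGETS},
    4: {s: 1 for s in DEGRADERS},
    5: {s: 1 for s in NON_TARGETS},
    6: {"NM05": 2, "IITR02": 1, "RHA1": 1},
    7: {"IITR02": 2, "NM05": 1, "RHA1": 1},
    8: {"RHA1": 2, "NM05": 1, "IITR02": 1},
}


def degrader_fraction(mix: Dict[str, int]) -> float:
    """Share of the mixture contributed by the three degrader strains."""
    total = sum(mix.values())
    return sum(v for s, v in mix.items() if s in DEGRADERS) / total
```

For example, `degrader_fraction(MIXTURES[1])` gives 0.5, while mixtures 4 and 6 to 8 contain degraders only and mixture 5 serves as the non-target control.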
DNA concentrations were measured with a Nanodrop spectrophotometer (Nanodrop Technologies Inc., Rockland, USA), and the quality was determined using the DNA 12000 kit (Caliper Sciences, USA) on the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). The fluorescent Cy3 labeling of the mixed target DNA was performed as described above.
Verification of the microarray data using polymerase chain reaction
In order to validate the microarray results, two different genes were amplified from the soil DNA of the IPL samples and confirmed by nucleotide sequencing. A linA gene coding for a dehydrodechlorinase, which initiates γ-HCH degradation, was repeatedly amplified. A specific primer set of linAF (5′-GCGGATCCGCATGAGTGATCTAGACAGACTT-3′) and linAR1 (5′-GCCTCGAGTTATGCGCCGGACGGTGCGAAATG-3′) (Kumari et al. 2002) was designed for the detection of the linA gene (471 bp) from the soil DNA. Similarly, the tcbAa gene coding for a chlorobenzene dioxygenase α-subunit was also amplified from the same
soil sample. We designed a primer pair, tcbAaF (5′-GAAGGATCCATGGGCATGAATCACACCGAC-3′) and tcbAaR (5′-TTCAAGCTTGCGTGTGGCGTTCAGTGC-3′), based on the sequence of the tcbAa gene reported (EU825676) from our isolate Bordetella sp., which is capable of utilizing 1,2,4-TCB. The PCR reactions were performed with an initial denaturation at 95 °C followed by 30 cycles of 94 °C (60 s), 50 °C (45 s), and 72 °C (60 s), with a final extension at 72 °C for 7 min, and the amplified product was analyzed on a 1% agarose gel.
Scanning and calculation of hybridization signal
Slides were immediately scanned using an Agilent microarray scanner (Agilent Technologies, Santa Clara, CA, USA). Raw data were extracted using the Agilent Feature Extraction software version 9.3, and the analysis was done using GeneSpring GX 7.3 (Agilent Technologies, Santa Clara, CA, USA).
Data analysis
All the features in the microarray were normalized to the 50th percentile to arrive at a fold detection value. The raw median signal intensity of each feature was normalized against the median local background signal obtained from the raw data files to get the processed (dye normalized and background subtracted) signal intensity. Probes that were flagged as “positive and significant” based on their signal to noise ratio (SNR) of ≤3 and a p value 1,000-fold) signal intensity was observed for these two cultures, the signal intensities of all the other 16S rRNA probes had signal intensity comparable to the background. Comparable signal intensity values were also obtained when the cells of NM05 were doubled relative to IITR02 and RHA1 (combination 6). Thus, the comparable signal intensity suggests that maximum hybridization was achieved even at a lower concentration (∼500 ng) of target DNA, and increasing the concentration above the optimum did not lead to any substantial increase in signal intensity. 
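The two normalization steps named under Data analysis (subtraction of the local background and scaling to the 50th percentile of the array) can be sketched as follows. This is a simplified illustration and not the Feature Extraction or GeneSpring implementation; flooring negative values at zero is an assumption made here, and the function names are hypothetical.

```python
from statistics import median
from typing import List, Sequence


def background_subtract(signal: Sequence[float],
                        background: Sequence[float]) -> List[float]:
    """Subtract each feature's local background from its raw median signal.

    Assumption: negative results are floored at zero.
    """
    return [max(s - b, 0.0) for s, b in zip(signal, background)]


def normalize_to_median(values: Sequence[float]) -> List[float]:
    """Scale all features so that the array median (50th percentile) is 1."""
    m = median(values)
    if m == 0:
        raise ValueError("array median is zero; cannot normalize")
    return [v / m for v in values]
```

After these two steps, a value of 2.5 for a feature means a background-subtracted signal 2.5 times the array median. The significance flagging based on SNR and p value is not reproduced in this sketch.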
The detection specificities of the microarray for functional gene probes were also evaluated using three different degradation pathways. The mixed DNA hybridization results led to the detection of specific genes involved in
HCH, biphenyl, and 1,2,4-trichlorobenzene degradation from strains NM05, RHA1, and IITR02, respectively (GEO data; Accession number GSE2435; Data title MC_Val_G1 to G8). The γ-HCH degradation genes (linA, linB, linC, linD, linE, linF, and linX) of NM05 showed very high signal intensity (>2,000-fold) above the background (Fig. 1). Similarly, the 1,2,4-TCB degradation genes (tcbAa, tcbAb, tcbAc, tcbAd, tcbB, tcbC, tcbD, tcbF) present in the strain IITR02 gave a high signal intensity of >2,000-fold (Fig. 1). The genes present in RHA1, such as biphenyl dioxygenase (bphA) and dihydrodiol dehydrogenase (bphB) involved in the degradation of biphenyl; salicylate hydroxylase (nahG) involved in the degradation of naphthalene; chlorocatechol-1,2-dioxygenase (tfdC), chloromuconate cycloisomerase (tfdD), dienelactone hydrolase (tfdE), and maleylacetate reductase (tfdF) involved in the degradation of 2,4-dichlorophenoxyacetic acid; and the tod genes (todC1 and todF) involved in the degradation of toluene, were also detected with high signal intensity (>500–1,000-fold) above the background. Although DNA of non-target strains was mixed with that of the target strains, the results indicate that the developed array can specifically detect the target genes from complex mixed communities (Fig. 1). Doubling the DNA concentration of any of the target cells (NM05, IITR02, and RHA1) led to a significant signal enhancement over the original values for the functional genes. When genomic DNA of the pure bacterial cultures was mixed in equal concentrations (combination 3), the signal values increased substantially compared to combination 1, in which the cells of all six cultures were mixed before the extraction of DNA. This may be due to the presence of non-target DNA, which possibly affected the hybridization efficiency. 
Fig. 1 The distribution of background-subtracted signal intensity levels for 40 functional genes obtained upon hybridization of Cy3-labeled genomic DNA from mixed cultures (degrader strains: NM05, IITR02, and RHA1) in the presence of non-target strains (non-degraders: DH5α, BL21, and K12). The genomic contents of all six strains used in the study are known, and they were mixed in defined ratios to evaluate the specificity of the developed microarray. M1, mixed culture that contained equal numbers of cells (10⁷) of all six strains (both degrader and non-degrader strains); M2, mixed culture that contained equal numbers of cells (10⁷) of only non-degrader strains
Determination of detection sensitivity of developed 60-mer array
In natural habitats, diverse microorganisms are present at different abundance levels, and some are represented at extremely low levels. This could prevent accurate estimation of the functional genes present in them and may grossly affect the detection sensitivity of the array. Hence, to evaluate the detection sensitivity of the array, labeled genomic DNA from Sphingomonas sp. strain NM05 at 300, 500, 800, 1,000, and 1,250 ng was hybridized. At the hybridization temperature of 65 °C, the genes linA, linB, linC, linD, linE, linF, and linX showed comparable signal intensities at all the concentrations, more than 500-fold above the background signal intensity. Such high signal values were expected, as the soil may be enriched with Sphingomonas sp. strain NM05. Other gene probes showed signals barely detectable above the background signal intensities. Comparable signal intensities were observed at all the concentrations, showing that increasing the amount of labeled sample does not affect the signal intensity, as spot saturation is achieved even at a low concentration of 300 ng of labeled DNA. In order to evaluate the detection sensitivity of the array with environmental DNA, 10, 30, and 50 ng of diluted DNA extracted from the IPL sample was Cy3-labeled and hybridized to the array. At 10 ng of DNA, a dichlorophenol monooxygenase (tfdB) showed the strongest hybridization signal, and at 30 ng an aniline dioxygenase (atdA3) additionally showed significant hybridization, with a signal threefold that of tfdB. When 50 ng of DNA was hybridized to the array, a total of 272 genes showed hybridization signals higher than the background, suggesting that 50 ng may be enough to detect diverse biodegradation genes from complex environmental DNA.
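One simple way to express "comparable signal intensities at all the concentrations" numerically is the coefficient of variation of a probe's signal across the DNA inputs. The sketch below is an illustration only, not the authors' analysis: the 10% cutoff and the function names are assumptions, and no intensity values from the study are used.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Population standard deviation divided by the mean."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("mean signal is zero")
    return pstdev(values) / mu


def is_saturated(signal_by_input_ng: Dict[int, float],
                 max_cv: float = 0.10) -> bool:
    """True if the signal barely changes across DNA input amounts.

    Assumption: a CV of at most 10% counts as 'comparable'.
    """
    return coefficient_of_variation(list(signal_by_input_ng.values())) <= max_cv
```

A probe whose signal stays flat from 300 to 1,250 ng would be reported as saturated, whereas a probe whose signal rises steeply between 10 and 50 ng would not. The keys of the dictionary are DNA inputs in nanograms, and the values are placeholder fold-over-background signals supplied by the user.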
Detection of the functional genes in the contaminated site samples
Based on the cross reactive spot intensity, the presence of many genes encoding complete degradation pathways for various contaminants was observed in five different complex contaminated environments. The spots showing an SNR of ≤3 and a p value of obtained from each environment and showed a higher value compared to all other genes hybridized in that particular sample. From all the five environments studied, genes representing 27 different biodegradation pathways were detected (Table 2). Many of these degradation pathways were detected in more than one environment, based on the similar signal intensity values obtained. Altogether, a total of 984 probes were designated as significantly detected based on the p value (≤0.05) of Cy3 signal intensities (Fig. S1, supplementary material). A maximum of 20 different complete degradation pathways were detected in the river bed samples, six in the chloroaromatic industrial contamination site, five in the heavy metal-contaminated site, and three in the petroleum-contaminated environment (Table 2). From the CETP site, no complete pathway was detected; however, several individual genes were present. Complete pathways for the degradation of chlorobenzene, carbazole, pentachlorophenol, biphenyl, chlorocatechol, naphthalene, phenanthrene, cyclohexane, toluene, xylene, phthalate, 2,4-dichlorophenoxyacetic acid, aniline, benzene, halobenzoate, sulfocatechol, alkane, catechol, dibenzofuran, and chlorobenzoate were detected in the DNA of the river bed site (Table 2). The river bed receives effluents from different types of industries and runoff from agricultural fields. This leads to persistently high levels of various groups of contaminants in the river bed site, and therefore to the presence of genes and pathways for their degradation. Specifically, in the chloroaromatic contamination site, genes involved in the complete biodegradation of benzene, chlorobenzene, carbazole, chlorobenzoate, γ-hexachlorocyclohexane (lindane), and 1,2,4-trichlorobenzene (1,2,4-TCB) were