Immunogenetics (2007) 59:233–246 DOI 10.1007/s00251-006-0179-1

ORIGINAL PAPER

A gene family of putative immune recognition molecules in the hydroid Hydractinia

Ryan S. Schwarz & Linda Hodes-Villamar & Kelly A. Fitzpatrick & Matthew G. Fain & Austin L. Hughes & Luis F. Cadavid

Received: 15 March 2006 / Accepted: 24 October 2006 / Published online: 11 January 2007
© Springer-Verlag 2007

Abstract Animal taxa display a wide array of immune-type receptors that differ in their specificities, diversity, and mode of evolution. These molecules ensure effective recognition of potential pathogens for subsequent neutralization and clearance. We have characterized a family of putative immune recognition molecules in the colonial hydroid Hydractinia symbiolongicarpus. A complementary DNA fragment with high similarity to the sea urchin L-rhamnose-binding lectin was isolated and used to screen 9.5 genome equivalents of a H. symbiolongicarpus bacterial artificial chromosome library. One of the resulting 19 positive clones was sequenced and revealed the presence of a 5,111-bp gene organized in 13 exons and 12 introns. The gene was predicted to encode a 726-amino acid secreted modular protein composed of a signal peptide, an anonymous serine-rich domain, eight thrombospondin type 1 repeats, and an L-rhamnose-binding lectin domain. The molecule was thus termed Rhamnospondin (Rsp). Southern hybridization and sequence analyses indicated the presence of a second Rsp gene. The cDNA from both Rsp genes was sequenced in 18 individuals, revealing high levels of genetic polymorphism. Nucleotide substitutions were distributed throughout the molecule, and the number of synonymous substitutions per synonymous site was significantly higher than the number of nonsynonymous substitutions per nonsynonymous site. Whole-mount in situ hybridization and semi-quantitative reverse transcription polymerase chain reaction of microorganism-challenged colonies indicated that Rsp molecules were specifically and constitutively expressed in the hypostome of the gastrozooids’ mouth. Thus, the combination of (1) comparative analysis of domain composition and function, (2) polymorphism, and (3) expression patterns suggests that Rsp genes encode a family of putative immune recognition receptors, which may act by binding microorganisms invading the colony through the polyp’s mouth.

Keywords Hydractinia · Invertebrate immunity · TSR superfamily · Rhamnose-binding lectin · Polymorphism

Nucleotide sequence data reported are available in the GenBank database under accession numbers DQ436480 and DQ446178–DQ446202.

R. S. Schwarz : L. Hodes-Villamar : K. A. Fitzpatrick : M. G. Fain : L. F. Cadavid
Department of Biology, The University of New Mexico, Albuquerque, NM 87131-0001, USA

A. L. Hughes
Department of Biological Sciences, University of South Carolina, Coker Life Sciences Building, 700 Sumter Street, Columbia, SC 29208, USA

L. F. Cadavid (*)
Departamento de Biología and Instituto de Genética, Universidad Nacional de Colombia, Cr. 30 #45-08, Bogota, DC, Colombia
e-mail: [email protected]

Introduction

Animals have evolved a wide spectrum of immune recognition strategies that ensure effective neutralization and clearing of potential pathogens. Most immune recognition molecules display either broad specificity and low diversity or narrow specificity and high diversity (Litman et al. 2005). Prominent among the latter are the immunoglobulins and T-cell receptors, the molecular hallmarks of the adaptive immune system of jawed vertebrates. These antigen-binding receptors diversify primarily through somatic recombination mediated by the recombination activating gene (RAG) system. The capacity to generate highly variable immune recognition receptors, however, is not unique to jawed vertebrates. Invertebrate immune systems are also equipped with such molecules, although the variability is generated through RAG-independent mechanisms. Examples include V-region-containing chitin-binding proteins (VCBPs) in amphioxus (Cannon et al. 2002), the putative fusibility/histocompatibility (Fu/HC) molecule in the ascidian Botryllus schlosseri (De Tomaso et al. 2005), lipopolysaccharide (LPS)-inducible molecules in sea urchins (Nair et al. 2005), Down’s syndrome cell adhesion molecules (Dscam) in Drosophila (Watson et al. 2005), and immunoglobulin superfamily (IgSF)-domain-containing fibrinogen-related proteins (FREPs) in the snail Biomphalaria glabrata (Zhang et al. 2004).

Structurally, at least one of three molecular frameworks is commonly found in most immune recognition receptors, namely, the IgSF framework, the leucine-rich repeat framework, and the lectin framework (Litman et al. 2005). Molecules of the latter play a fundamental role in both immune recognition and effector responses in invertebrate taxa (Quesenberry et al. 2003; Wang and Zhao 2004). Animal lectins are classified into several families based on their structure and carbohydrate specificity. They include galectins (Barondes et al. 1994), ficolins (Le et al. 1998), C-type (Day 1994), P-type (Kornfeld 1992), I-type (Powell and Varki 1995), lily-type (Suzuki et al. 2003), heparin-binding proteins (Margalit et al. 1993), and pentraxins (Steel and Whitehead 1994). Most recently, L-rhamnose-binding lectins (RBLs) have been proposed as a new lectin family (Tateno et al. 2001). RBLs are found as single-domain proteins, as two or three tandemly repeated domains, or as part of modular proteins. The RBL domain is about 95 amino acids long and is characterized by eight highly conserved cysteine residues and characteristic N- and C-terminal motifs (Tateno et al. 2002b). RBL was first purified from eggs of the sea urchin Anthocidaris crassispina (Ozeki et al.
1991), but it has also been found in eggs, spleen, skin mucus, leukocytes, thrombocytes, and serum of various species of fish (Hosono et al. 1999; Okamoto et al. 2005; Shiina et al. 2002; Tateno et al. 2002a,b). Furthermore, some types of fish RBL are known to play a role in immunity (Kudo and Inoue 1986; Shiina et al. 2002; Tateno et al. 2002a).

A less prevalent molecular framework of immune recognition molecules is the thrombospondin type 1 repeat (TSR) superfamily. The TSR superfamily includes a large and diverse group of extracellular and transmembrane proteins involved in the regulation of extracellular matrix organization, cell–cell interactions, and cell guidance (Adams and Tucker 2000; Tucker 2004). Various TSR superfamily proteins play direct roles in immunity. They include the alternative complement pathway component properdin, the terminal complement proteins C6, C7, C8, and C9 (Goundis and Reid 1988), and the pattern-recognition receptor mindin (He et al. 2004). The TSR domain consists of approximately 60 amino acids folded into a three-stranded antiparallel, spiraling domain (Tan et al. 2002). Two of the strands have a β structure, whereas the other is irregular. The core of the structure is formed by interdigitating side chains of cysteine, tryptophan, and arginine from the three strands into layers. This pattern forms a positively charged face containing a groove-like structure that is thought to be the recognition site for ligands.

The hydroid Hydractinia symbiolongicarpus and its sister species H. echinata (Cnidaria; Hydrozoa) are a commonly used model system in comparative immunology (Frank et al. 2001). H. symbiolongicarpus colonies are found in near-shore waters of northeastern USA, growing as a surface incrustation of gastropod shells occupied by pagurid hermit crabs (Buss and Yund 1989). Colonies are diploblastic, dioecious, and composed of a network of polyps interconnected through endodermal canals. Polyps are tubular structures with an outer ectoderm and an inner gastrovascular cavity lined with an endodermal epithelium. The polyps’ gastrovascular epithelium is continuous with that of the canals, and both are embedded into a two-dimensional ectodermal basal plate known as the stolonal mat. There are two main types of polyps in Hydractinia, gastrozooids and gonozooids. The former are feeding polyps and have the mouth set on an elevated hypostome, surrounded by tentacles. The tentacles are populated by nematocytes, or stinging cells, which capture and stun small animal prey and carry it to the mouth region where it is ingested whole. The latter function as gamete carriers, releasing their gametes into the water where fertilization occurs. Fertilized eggs develop into crawling planula larvae, which settle on hermit crab-occupied shells, and subsequently metamorphose into primary polyps. As canals extend, bifurcate, and anastomose, new polyps bud, thus generating a mature colony (Ballard 1942; Berking 1991).

In this paper, we report the identification in H. symbiolongicarpus of two related genes predicted to encode multi-domain proteins containing an RBL domain and eight TSR domains. The genes were exclusively expressed in the hypostome of gastrozooids and did not appear to be upregulated after bacterial or fungal challenge. Complementary DNA sequencing in 18 individuals revealed high levels of nucleotide polymorphism. These molecules might represent immune-type receptors that bind microorganisms entering the colony through the mouth of gastrozooids.

Materials and methods

Animal cultures

Three partially inbred and 15 wild-type H. symbiolongicarpus colonies were used in this study. The three partially inbred animals (BC34, F2-37, and RC110) were derived from the inbreeding program designed to genetically characterize Hydractinia allorecognition responses (Cadavid et al. 2004). RC110 was the product of a recombinant cross between near-isogenic colonies 338-8 and 4117-2. F2-37 was the offspring of a brother–sister mating between colonies RC79 and RC18, two RC110 full-siblings. BC34 was a fourth-generation backcross between colonies 338-8 and 431-63. Wild-type colonies 12B, 13B, and 35B were collected from Guildford, CT, USA (provided by L. Buss and M. Nicotra, Yale University), and wild-type colonies WH02, WH03, WH04, WH13, WH14, WH15, WH25, WH26, WH27, WH28, WH31, WH36, and WH46 were collected from Woods Hole, MA, USA. For field-collected colonies, five- to ten-polyp fragments were explanted from colonies growing on gastropod shells onto microscope glass slides and held in position with a thread for 3–5 days until colonies attached to the glass. Animals were cultured at 18°C in artificial seawater (Coralife) and fed with Artemia salina nauplii three times a week with 15% water volume replacement.

RNA isolation, RT-PCR, cloning, sequencing, and RACE

Total RNA was isolated from a 20-polyp fragment of colony 35B, starved for 3 days, using the Trizol reagent (Invitrogen, Carlsbad, CA, USA). A reverse transcription polymerase chain reaction (RT-PCR) amplification was carried out with the Access RT-PCR system kit (Promega, Madison, WI, USA) following the manufacturer’s recommendations and using a degenerate forward primer 5′-GG(A/T)TGTGG(A/T)GA(A/G)CAAACTATG-3′ and a (dT)17 reverse anchor primer 5′-GACTCGAGTCGACATCGA(T)17-3′. A resulting 321-bp cDNA fragment was gel-purified with the QIAquick gel extraction kit (Qiagen, Valencia, CA, USA) and cloned using the pGEM-T easy vector system (Promega). Plasmid clones were sequenced using the BigDye Terminator v1.1 enzyme in an ABI Prism 3100 genetic analyzer (Applied Biosystems).
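The degenerate forward primer quoted above is really a mixture of oligonucleotides, one for each combination of the bracketed alternatives. The following is a minimal illustrative sketch, not part of the original methods, that enumerates that mixture. The function name `expand_degenerate` is hypothetical, and the primer string is written without spaces on the assumption that any space in the printed sequence is a typesetting artifact.

```python
from itertools import product


def expand_degenerate(primer: str) -> list[str]:
    """Expand '(A/T)'-style alternatives into every concrete primer sequence."""
    positions = []  # one list of allowed bases per primer position
    i = 0
    while i < len(primer):
        if primer[i] == "(":
            close = primer.index(")", i)  # end of this group of alternatives
            positions.append(primer[i + 1:close].split("/"))
            i = close + 1
        else:
            positions.append([primer[i]])  # fixed, non-degenerate base
            i += 1
    return ["".join(bases) for bases in product(*positions)]


# Forward primer as quoted in the text (assumed to contain no spaces).
FORWARD = "GG(A/T)TGTGG(A/T)GA(A/G)CAAACTATG"
variants = expand_degenerate(FORWARD)

print(len(variants))  # size of the oligonucleotide mixture
for seq in variants:
    print(seq)
```

With three two-fold degenerate positions the mixture contains 2 × 2 × 2 = 8 distinct 21-mers, so each individual sequence makes up only one eighth of the primer pool.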
The cDNA was used to screen the bacterial artificial chromosome (BAC) library (see below) and to design the initial walking primers to extend the sequence of the target gene in both 5′ and 3′ directions, using BAC clone 106O14. Full-length cDNA was amplified by RT-PCR using forward primer 1F (5′-GCTACCTGTGGTCGTAGTCACCTGCTCT-3′) and reverse primer 3R (5′-CAGAGATAAATATAATACAAGTTTTTGC-3′), which anneal to the first exon and 3′ untranslated region (UTR), respectively. The cDNA product was subsequently cloned and sequenced as described above.

To examine the levels of polymorphism, total RNA was isolated from the 18 colonies, as described above, and used as template for RT-PCR amplification with the Access RT-PCR system kit (Promega) using avian myeloblastosis virus reverse transcriptase for first-strand DNA synthesis and DNA polymerase from Thermus flavus for second-strand cDNA synthesis and DNA amplification. The primers 2F (5′-GCACTAACCCCGAACCAGCT-3′) and 2R (5′-CGTATGGTCCACCTCTGCATCTGTAG-3′) were used to amplify an 866-bp fragment of the target gene’s 3′ region, chosen because it could be sequenced unambiguously in as few as two sequencing reactions (forward and reverse). The cDNAs were cloned in pGEM-T easy vector (Promega), and 6 to 30 clones per individual were sequenced in both directions using SP6 and T7 vector primers. Genomic DNA was isolated from individuals WH04 and WH02 using the Trizol DNA extraction method (Invitrogen), and the RBL domain region of the Rsp1 locus was amplified using forward primer 6F (5′-GCGGATACAAAACAGCAAGAG-3′) and reverse primer 3R (5′-CAGAGATAAATATAATACAAGTTTTTGC-3′).

Polymerase fidelity was evaluated by amplifying, cloning, and sequencing a 994-bp cDNA fragment from the H. symbiolongicarpus actin gene using forward primer 5′-GACAATGGATCTGGTATGTGC-3′ and reverse primer 5′-CAATCCATACGGAGTATTTTCGC-3′, with the Access RT-PCR system kit (Promega) under conditions comparable to those used to amplify the 866-bp fragment of the target gene described above. Products were cloned into pGEM-T easy vector (Promega), and 22 clones were completely sequenced in both directions as before. Sequences were aligned with ClustalX software (Thompson et al. 1994).

The 5′ and 3′ ends of the target gene were amplified using Invitrogen’s 5′ and 3′ rapid amplification of cDNA ends (RACE) system essentially as described by the manufacturer. The 578-bp 5′ and 295-bp 3′ RACE products were purified, cloned, and sequenced as described above.

BAC library construction and hybridization analysis

Forty to fifty grams of tissue from inbred colony 4117-2 (Cadavid et al.
2004) was mechanically disrupted and embedded into several 1000-μl plugs of 1% InCert agarose (BMA, Rockland, ME, USA). The agarose-embedded tissue was sent to Amplicon Express (Pullman, WA, USA) for high-molecular weight DNA extraction, partial digestion with HindIII, and BAC library construction in vector pBECBAC1. The library had a 132.7-kb average insert size and contained 72,728 clones, corresponding to approximately 13 equivalents of the 7.5×10^5 kb Hydractinia genome (Cadavid et al. 2004). The library was spotted on high-density nylon membranes for hybridization analysis. The original 321-bp PCR product containing the RBL domain (see above) was labeled with digoxigenin (DIG)-dUTP through random priming (Roche Diagnostics, Indianapolis, IN, USA) and used to screen 55,296 BAC clones (9.5 genome equivalents). Hybridization was carried out overnight at 65°C in 5× saline sodium citrate (SSC; 0.3 M sodium citrate, 3.0 M sodium chloride, pH 7.0), 0.1% N-lauroylsarcosine, 0.02% sodium dodecyl sulfate (SDS), and 1% blocking reagent (Roche Diagnostics). Filters were washed twice for 5 min at room temperature in 2× SSC, 0.1% SDS, and twice for 15 min at 68°C in 0.5× SSC, 0.1% SDS. Chemiluminescent detection was done with CSPD (Roche Diagnostics), and autoradiography was performed with Blue Lite autorad films (ISC-BioExpress, Kaysville, UT, USA). Nineteen positive clones were identified, and DNA was isolated from them using the BACMAX DNA purification kit (Epicentre, Madison, WI, USA) or the Qiagen large-construct kit. BAC clone DNA was digested with EcoRI, BamHI, or HindIII (Promega), run on a 0.7% agarose gel, and blotted onto Immobilon-NY+ nylon membrane (Millipore, Bedford, MA, USA) by capillary transfer in 20× SSC. The membrane was hybridized overnight at 65°C with the 321-bp RBL cDNA probe and washed and detected as above. PCR on DNA from four BAC clones was performed to confirm gene copy number using primers 4F (5′-CCAGTTGATGGTGGTTACAG-3′) and 1R (5′-ATCACCAAAGATACTAT-3′).

Northern blot analysis

Ten micrograms of total RNA from individual F2-37 was isolated using Trizol reagent (Invitrogen), loaded into a 1.5% agarose-formaldehyde gel, separated by electrophoresis, transferred to a positively charged Immobilon-NY+ membrane (Millipore) using upward capillary transfer in 20× SSC buffer (3 M NaCl, 0.3 M sodium citrate), and crosslinked using UV light (40,000 μJ/cm²). Hybridization of the membrane was performed in DIG Easy Hyb buffer (Roche Diagnostics) at 68°C with 2.5 μg of DIG-labeled RNA probe complementary to a 572-bp region of rhamnospondin predicted to encode the RBL, TSR8, and partial TSR7 domains (the same probe as described under In situ hybridization).
Chemiluminescent detection of the probe-target hybrid using alkaline phosphatase-conjugated anti-DIG antibody and CSPD alkaline phosphatase substrate (Roche Diagnostics) was performed according to the manufacturer’s protocol, and the membrane was exposed to Biomax ML imaging film (Eastman Kodak, Rochester, NY, USA).

Southern blot analysis

Tissue from 3-day fasted H. symbiolongicarpus colony WH46 was lysed using a homogenization buffer (100 mM Tris-HCl pH 8.0; 100 mM NaCl; 200 mM sucrose; 50 mM EDTA; 0.5% SDS) and 50 μg proteinase K, then treated with 2.5 mg of RNase A (Sigma-Aldrich, St. Louis, MO, USA). Genomic DNA was extracted from the dissolved tissue using phenol, phenol/chloroform/isoamyl alcohol (25:24:1), and chloroform/isoamyl alcohol treatments, followed by 100% isopropyl alcohol and 70% ethyl alcohol precipitations. DNA was resuspended in nuclease-free water. Twenty-microgram samples of gDNA were digested using restriction enzymes BamHI and EcoRI (Promega). Digests were subjected to electrophoresis overnight through a 0.9% agarose gel and transferred to positively charged Immobilon-NY+ membrane (Millipore) using upward capillary transfer in 20× SSC buffer. DNA was crosslinked to the membrane using UV light (20,000 μJ/cm²). A 359-bp cDNA containing the RBL domain and 52 bp of TSR8 was amplified from a plasmid with an Rsp1 insert using primers 5F (5′-GCAGATTAGGTGCATCAAGGGAAAC-3′) and 2R (given above under RT-PCR description). The amplicons were purified using the QIAquick gel extraction kit (Qiagen), and 1.6 μg was used in a random-primed DNA labeling reaction with digoxigenin (DIG)-dUTP (Roche Diagnostics). Six hundred nanograms of DIG-labeled probe was used in a hybridization carried out overnight at 65°C in 5× SSC (3 M sodium chloride, 0.3 M sodium citrate, pH 7.0), 0.1% N-lauroylsarcosine, 0.02% SDS, and 1% blocking reagent (Roche Diagnostics). Membranes were washed twice under low stringency at ambient temperature in 2× SSC, 0.1% SDS and twice under high stringency at 65°C in 0.5× SSC, 0.1% SDS. Chemiluminescent detection was performed as described for hybridization of the BAC membranes.

In situ hybridization

A 572-bp cDNA spanning 87 bp of TSR7 and the complete TSR8 and RBL domains was amplified from individual 35B using forward primer 3F (5′-CTGTTCAAACCCAACTCCGAAGTA-3′) and reverse primer 2R (5′-CGTATGGTCCACCTCTGCATCTGTAG-3′). The percentage of sequence identity determined from the predicted cDNAs of Rsp1 and Rsp2 across this 572-bp fragment is 92.5%.
This cDNA was cloned into pGEM-T easy vector (Promega) from which sense and antisense DIG-labeled RNA probes were generated using in vitro transcription with SP6 and T7 RNA polymerase (Roche Diagnostics). Animals were starved ≥2 days before overnight fixation at 4°C with 4% paraformaldehyde in 0.1 M HEPES (2-hydroxyethyl-piperazine-1-ethane-sulfonic acid), pH 7.5, 0.42 M NaCl, and 2 mM MgSO4. After fixation, the tissue was decolored in stepwise alcohol washes (100% methanol, 100, 75, 50, and 25% EtOH) followed by three washes in PBT (phosphate-buffered saline containing 0.1% Tween-20). Tissue was treated with proteinase K in PBT (10 μg/ml), and the digestion was stopped with two brief washes in 4 μg/ml glycine/PBT and three washes in PBT. Samples were then treated with two 0.1 M triethanolamine, two 0.1 M triethanolamine/2.5 μl/ml acetic anhydride, and three PBT washes. Tissue was re-fixed in 4% paraformaldehyde/PBT for 1 h at room temperature. Fixative was removed with three PBT and two 2× SSC washes. Prehybridization was done first in 2× SSC for 15 min at 70°C, then in a 1:1 solution of 2× SSC and hybridization solution (50% formamide, 5× SSC, 1× Denhardt’s solution, 1.0% 100 μg/ml heparin, 0.1% Tween-20, 0.1% CHAPS) for 10 min at 70°C, and finally in hybridization solution including 100 μg/ml transfer RNA for 2 h at 58°C. DIG-labeled RNA probe was added to fresh hybridization solution and incubated for 48–60 h at 58°C in a wet chamber under gentle agitation. Probe was removed with hybridization solution diluted with 2× SSC in a stepwise manner (100, 75, 50, and 25%) at 58°C followed by two washes in 2× SSC+0.1% CHAPS. Tissue was then washed at room temperature twice in MAB (100 mM maleic acid, 150 mM NaCl, pH 7.5) and once in MAB–BSA (MAB+1% BSA). Tissue was incubated in a blocking solution (80% MAB–BSA, 20% heat-inactivated sheep serum) for 2 h at 4°C before adding anti-DIG antibodies (1/2,000) and incubating overnight at 4°C. Final washes were at room temperature: two in blocking solution, six in MAB, one in NTMT (100 mM NaCl, 100 mM Tris-Cl, 50 mM MgCl2, 0.1% Tween-20, pH 9.5), and one in NTMT/1 mM levamisole. BCIP/NBT (Roche Diagnostics) was added into NTMT and incubated with the tissue at 37°C. After staining, tissue was washed in water, methanol, and ethanol before embedding in Euparal as whole mounts.

Semi-quantitative RT-PCR of microorganism-challenged colonies

Eight- to ten-polyp fragments of colonies 12B and 35B were explanted onto glass slides and challenged with live Bacillus subtilis (1.045×10^9 cells/ml), Vibrio sp. (1.05×10^9 cells/ml), or Saccharomyces cerevisiae (7.00×10^7 cells/ml). The first two contain L-rhamnose in their cell wall, whereas the last one does not. Of these microorganisms, only Vibrio is normally found in saltwater. Microorganisms were grown overnight in 500 ml of nutrient broth, pelleted, and resuspended in 50 ml of sterile artificial seawater. Colonies in 50 ml of seawater were challenged for 1 h at room temperature with 1 ml of the resuspended bacterial or fungal culture. Total RNA was isolated from colonies at 0, 1, 3, 6, 18, and 48 h post challenge. First-strand cDNA was synthesized from 2.5 μg of total RNA using first-strand cDNA synthesis (Amersham Biosciences, Little Chalfont, Buckinghamshire, England), and 1:10 dilutions were used as starting template for PCR amplification. Primers 3F and 2R were used to generate rhamnospondin-specific amplicons (see description from “In situ hybridization” above). Actin was also amplified from each sample as a loading control using forward primer 5′-GACAATGGATCTGGTATGTGC-3′ and reverse primer 5′-GATGGCAACATACATAGCAGG-3′. PCR amplification was optimized to occur just below saturation by varying the template volume of each sample but keeping thermal cycling conditions consistent for each primer set (rhamnospondin or actin). Ten microliters of PCR product was run in a 2% agarose gel and stained with ethidium bromide for visualization.

Evolutionary analyses

The number of synonymous nucleotide substitutions per synonymous site (dS) and the number of nonsynonymous nucleotide substitutions per nonsynonymous site (dN) were estimated by Nei and Gojobori’s (1986) method. The standard errors of mean dS and dN were estimated by the bootstrap method (Nei and Kumar 2000). Means of dS and dN in between-locus comparisons were estimated on a net basis; that is, the effect of polymorphism within loci was removed (Nei and Kumar 2000, p. 256). Tajima’s (1989) method was used to test the hypothesis that the pattern of allelic polymorphism is consistent with selective neutrality. Computations were conducted using the software packages Mega 3.1 (Kumar et al. 2004) and DNAsp 4.0 (Rozas et al. 2003).

Results

Rhamnospondin gene architecture

A 321-bp cDNA fragment amplified by degenerate PCR from H. symbiolongicarpus colony 35B was isolated and showed highest similarity to amphioxus (Branchiostoma belcheri) RBL and sea urchin A. crassispina egg lectin (BLAST e=4×10^−17 and 4×10^−14, respectively). This cDNA fragment was used to screen a H. symbiolongicarpus BAC library of 55,296 clones corresponding to 9.5 genome equivalents. With this coverage, the probability of finding a single-copy gene in the library is >99%. The screening resulted in 19 positive clones, one of which (clone 106O14) was selected for sequencing to extend the original lectin fragment in both 3′ and 5′ directions. This BAC clone sequencing yielded a 5,185-bp gene model composed of 13 exons and 12 introns (Fig. 1a). The gene model was flanked by 5′-end putative TATA boxes at positions -55 (GCTGTATATT) and -18 (TTTATATTTC) and by a 3′-end putative polyadenylation site at position 5,176 (GGATTAAAAC) (WebGene, http://www.itb.cnr.it/sun/webgene). Full-length cDNA sequence of 2,178 bp was obtained by RT-PCR with 5′ and 3′ RACE. This cDNA confirmed the intron/exon structure of the gene model (GenBank accession number DQ436480).

Total RNA from adult H. symbiolongicarpus hybridized with a probe specific for a 572-bp 3′ terminal region of the gene detected a single ∼2.5 kb RNA (Fig. 1b). This is the size expected from the predicted 2,178-bp cDNA plus a poly(A) tail of ∼300 nucleotides.

The gene was predicted to encode a 726-amino-acid secreted protein composed of a signal peptide, an N-terminal serine-rich domain (SRD), eight tandemly repeated TSRs, and a C-terminal RBL domain (Fig. 1a). The signal peptide, predicted to be 17 amino acids long using the SignalP software (http://www.cbs.dtu.dk/services/SignalP/), was encoded by exon 1 and the first 8 bp of exon 2. The remainder of exon 2 and all of exon 3 were predicted to encode the 144-amino-acid SRD, which showed no significant similarity to any sequence in BLAST searches. The SRD was abundant in hydrophilic residues (17 Ser, 14 Lys, 12 Glu, and 10 Asp residues) and was predicted to have a glycosaminoglycan attachment site (motif DDSAE) at position 135–139 (The Eukaryotic Linear Motif resource, http://www.elm.eu.org). Exons 3–11 were predicted to encode the eight TSRs, which were 57 amino acids long, except for TSR3 and TSR8, which had 59 amino acids. Amino acid sequence similarity among the eight TSRs ranged from 39 to 61%, whereas similarity between the eight TSRs and the TSR2 and TSR3 of human thrombospondin-1 ranged from 48 to 66%. TSRs showed the conserved 6 Cys, 1–2 Trp, and 1–2 Arg that characterize the repeat (Fig. 1c), indicating that they might fold into the three-layered structure of the thrombospondin-1 TSRs (Tan et al. 2002). The TSRs have, at a homologous position, a motif similar to the one found in thrombospondin-1 (CSVTCG), which binds sulfated glycoconjugates and interacts with the cell receptor CD36 (Dawson et al. 1997). Furthermore, all TSRs had a motif similar to WSXWS and one identical to RXR, both implicated in cell adhesion and binding to glycosaminoglycan (Tan et al. 2002). The RBL domain was encoded by exons 12 (98 amino acids) and 13 (6 amino acids). The RBL domain had the eight Cys residues as well as the sequence motifs ANYGR and DPCXGTXKYL (Fig. 1d), typical of known RBLs (Hosono et al. 1999; Ozeki et al. 1991; Tateno et al. 2001, 2002a). No such domain organization was found in sequence database searches, and accordingly, the gene was denominated Rhamnospondin (Rsp).

Fig. 1 Gene architecture of Hydractinia’s rhamnospondin (Rsp). a Genomic DNA with exon/intron structure (upper) and cDNA with inferred domain organization (lower) of Rsp, with scale indicated in base pairs. The 5′ and 3′ untranslated regions are indicated by hashed boxes. Relevant primer sites and amplification direction are indicated by black arrows. Primer 2R was designed in exons 12 and 13 to span intron 12, as indicated by the hashed arrow. Restriction sites for EcoRI (E), BamHI (B), and HindIII (H) are indicated. Probe locations and sizes are indicated by the heavy bars in the lower part of (a). b Hybridization of rhamnospondin (Rsp) RNA as analyzed by Northern blot. A 572-bp DIG-labeled antisense RNA probe containing the RBL, TSR 8, and partial TSR 7 region of rhamnospondin was used to probe 10 μg of total RNA from H. symbiolongicarpus. RNA size marker positions (Roche Diagnostics) run in the same gel are indicated. c Alignment of the deduced amino acid sequence of the eight TSR domains from Rsp and TSR-2 and TSR-3 from TSP1. Sequence identity is indicated by black background and conservative changes, i.e., those preserving physicochemical properties, are shown with gray background. Dashes denote gaps introduced to maximize the alignment. Paired cysteine residues known to form disulfide bonds in TSR-2 and TSR-3 are connected by brackets. Important structural and binding residues (W, S, and R) known from TSR-2 and TSR-3 are indicated with asterisks. The thrombospondin-1 motif involved in sulfated glycoconjugate and CD36 interaction is indicated by the heavy bar. d Alignment of deduced amino acid sequence of Rsp RBL domain compared to sea urchin egg lectin (SUEL [A37961]), amphioxus (B. belcheri) RBL (RBL-Brbe, [AAT39418.1]), steelhead trout RBLs (STL1C and STL1M [BAB83628.1], STL2N and STL2C [BAA92256.2], STL3N and STL3C [BAA92257.3]), smelt (Osmerus lanceolatus) RBL (OLLN and OLLC [BAC11709.1]), and whitespotted char RBL (WCL3N and WCL3C, [BAB83629.1]). C, M, and N stand for C-terminal, Mid, and N-terminal, respectively, of tandemly repeated RBL domains. Conserved cysteine residues (asterisks) and motifs (heavy bars) typical of RBL domains are indicated. Accession numbers for these sequences are indicated in brackets
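The statement above that a single-copy gene has a >99% chance of being present in the screened BAC clones can be checked with a short calculation. The sketch below is illustrative only: it assumes the standard library-representation formula P = 1 − (1 − f)^N, where f is the fraction of the genome carried by one clone, and uses the clone number, mean insert size, and genome size quoted in the text. The function names are hypothetical, and the paper does not state which formula the authors actually used.

```python
import math


def genome_equivalents(n_clones: int, mean_insert_kb: float, genome_kb: float) -> float:
    """Library coverage expressed as multiples of the genome size."""
    return n_clones * mean_insert_kb / genome_kb


def prob_locus_present(n_clones: int, mean_insert_kb: float, genome_kb: float) -> float:
    """Probability that at least one clone carries a given single-copy locus."""
    fraction = mean_insert_kb / genome_kb  # share of the genome in one clone
    return 1.0 - (1.0 - fraction) ** n_clones


# Values quoted in the text.
N_SCREENED = 55_296      # BAC clones screened
MEAN_INSERT_KB = 132.7   # average insert size
GENOME_KB = 7.5e5        # Hydractinia genome size

coverage = genome_equivalents(N_SCREENED, MEAN_INSERT_KB, GENOME_KB)
p_exact = prob_locus_present(N_SCREENED, MEAN_INSERT_KB, GENOME_KB)
# Same estimate starting from the 9.5 genome equivalents reported in the text,
# using the approximation P = 1 - exp(-coverage).
p_reported = 1.0 - math.exp(-9.5)

print(f"coverage = {coverage:.2f} genome equivalents")
print(f"P(locus present) = {p_exact:.5f} (from clone count)")
print(f"P(locus present) = {p_reported:.5f} (from reported 9.5x)")
```

Under these assumptions the coverage computed from the raw numbers comes out slightly above the reported 9.5 genome equivalents (the text does not say how that figure was derived), and either value gives a probability well above 99%.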

Fig. 2 Southern blot analysis of Rsp in H. symbiolongicarpus. a Hybridization of positive BAC clone DNA using a 321-bp RBL domain probe. Left panel, hybridization profile of BAC clone 106O14, the one from which Rsp was sequenced. Right panel, hybridization profile of five other randomly selected Rsp positive BAC clones. M molecular marker, E, EcoRI, B, BamHI, H, HindIII.

239

Rhamnospondin is part of a gene family The number of Rsp loci present in the H. symbiolongicarpus genome was investigated by Southern hybridization analysis on both positive BAC clones and genomic DNA (Fig. 2). EcoRI was predicted to cut twice (positions 344 and 3,765 of genomic DNA), BamHI once (position 1,831), and HindIII twice (positions 4,925 and 5,295) in the Rsp genomic DNA. Using a single locus model to predict hybridizing fragments from the following Southern blot analyses, a single band at ≥1.4 kb is expected from EcoRI digestion, a single band at ≥3 kb from BamHI digestion, and from HindIII one band at 370 bp and a second at ≥5 kb. DNA from BAC clone 106O14 was digested with EcoRI, BamHI, and HindIII and hybridized with the 321-bp RBL cDNA used to screen the BAC library (Fig. 2a). EcoRI digestion resulted in two strong hybridization bands of ∼6 and ∼2.8 kb and one weakly hybridizing band at ∼3.5 kb. BamHI digestion yielded two hybridizing bands of ∼7 and ∼4.5 kb. HindIII digestion resulted in only one band of ∼8 kb, although the expected 370-bp fragment from this digest likely ran off the gel and was not detected. Digestion of an additional 16 positive BAC clones with EcoRI and BamHI gave identical results to those of clone

Sizes are indicated in Kb. b Genomic DNA from individual WH46 was digested with BamHI (B) and EcoRI (E), then probed with a 359bp RBL/TSR8 domain probe from Rsp. Hybridizing fragments are indicated by arrow heads. DNA size marker positions (Roche Diagnostics) run in the same gel are marked


Fig. 3 Dotplot analysis comparing exon 11 (E11), intron 11 (I11), and exon 12 (E12) of Rsp1 and Rsp2. Rectangles represent exons and the connecting line represents the intron. Numbers indicate nucleotide positions. Note the interruption in the identity line at the intron–intron comparison.

106O14. Figure 2b shows the hybridization pattern of genomic DNA from colony WH46 digested with EcoRI and BamHI and hybridized with the 359-bp RBL/TSR8 cDNA probe. BamHI digestion yielded two hybridizing fragments and EcoRI three, similar to the results of the BAC hybridization analysis. The EcoRI and BamHI patterns show more bands than a single-locus model for Rsp predicts and are consistent with the presence of two tightly linked Rsp loci. Because two different individuals were used for the BAC and genomic DNA hybridizations, the difference in the size of the middle EcoRI band between the two samples is likely explained by sequence polymorphism of the Rsp genes. To investigate further the presence of more than one Rsp gene, a fragment spanning exon 11, intron 11, and exon 12 was amplified (primers 4F and 1R) and cloned from four positive BAC clones (106O14, 94M20, 134O16, and 150B11). Two different Rsp sequences were obtained from each BAC clone. One was identical to the fully sequenced Rsp gene, whereas the second was 94 and 74.4% similar to it at exons 11 and 12, respectively. The introns, however, differed in size by 34 bp and so substantially in sequence that no meaningful alignment was possible (Fig. 3). Thus, Southern hybridization and sequencing of PCR products from BAC DNA indicated the likely presence of at least two Rsp genes

in H. symbiolongicarpus, hereafter referred to as Rhamnospondin-1 (Rsp1) and Rhamnospondin-2 (Rsp2).

Rhamnospondin genes are polymorphic

Using generic primers that amplify both Rsp loci (2F and 2R, Fig. 1a), an 866-bp cDNA fragment predicted to encode the last 22 amino acids of TSR5 and the complete TSR6, TSR7, TSR8, and RBL domains was amplified, cloned, and sequenced in 18 individuals. Three of the animals were derived from the inbreeding program used to map an allorecognition complex in Hydractinia (Cadavid et al. 2004), whereas the other 15 were wild-type colonies collected from two different locations in the northeastern USA. The fidelity of the polymerase used to amplify Rsp was estimated with the actin gene by amplifying, cloning, and sequencing a 994-bp cDNA fragment in 22 plasmid clones from a single individual (35B). Eight errors were found in 21,203 bp in total, yielding a polymerase error rate of 3.77×10⁻⁴ errors/bp (8/21,203). In addition, no evidence of recombination events was detected in the actin gene amplification. Based on this estimate and on the well-demonstrated fact that PCR generates artifactual recombinants of polymorphic and/or tandemly organized genes (Shafikhani 2002; Zaphiropoulos 1998), cDNA


Fig. 4 Alignment of variable nucleotide positions of Rsp1 and Rsp2 alleles identified in a sample of 18 Hydractinia colonies. Domains and positions are indicated at the bottom; dots denote identity to the first sequence. Alleles are designated by two- or four-digit numbers. Alleles predicted to encode the same peptide share the first two digits of their four-digit designation, whereas alleles with no silent counterparts have a two-digit designation. The right-hand panel indicates presence (filled squares) or absence (open squares) of a particular allele for each individual of the sample.
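The allele designation scheme described in this legend can be illustrated with a short Python sketch that groups alleles by locus and by the peptide they are predicted to encode. The "locus-digits" format follows the designation Rsp1-06 used in the text. The four-digit names in the example are hypothetical and were not read from Fig. 4, and the function name is likewise an assumption.

```python
from collections import defaultdict


def group_by_peptide(allele_names):
    """Map (locus, peptide code) to the alleles that share that peptide."""
    groups = defaultdict(list)
    for name in allele_names:
        locus, digits = name.split("-")
        if not digits.isdigit() or len(digits) not in (2, 4):
            raise ValueError(f"unexpected allele designation: {name}")
        # The first two digits identify the encoded peptide; the last two
        # digits of a four-digit name distinguish its silent variants.
        groups[(locus, digits[:2])].append(name)
    return dict(groups)


# Hypothetical designations for illustration only (not read from Fig. 4).
example = ["Rsp1-0101", "Rsp1-0102", "Rsp1-06", "Rsp2-0101"]
for (locus, peptide), alleles in group_by_peptide(example).items():
    print(locus, peptide, alleles)
```

Counting the keys per locus in such a grouping gives the number of distinct predicted peptides, which is how a count of alleles relates to a count of peptides.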

variants were only considered if they met two or more of the following criteria: (1) identical sequence in at least two plasmid clones per individual, (2) presence in more than one individual, and (3) no evidence of recombination between Rsp1 and Rsp2. Applying these criteria, 14 Rsp1 and 11 Rsp2 alleles were identified (Fig. 4). Rsp1 and Rsp2 alleles were each predicted to encode eight different peptides. Allele Rsp1-06 had a nonsense mutation at position 1,401 (TSR6). As two other Rsp1 alleles were identified in the same individual (35B), this null allele might represent an additional, nonfunctional Rsp locus. Yet this explanation seems unlikely, as the BAC clone and whole genomic DNA hybridization analyses were consistent with a two-locus model. More likely, it represents an amplification artifact that occurred early in the PCR cycling. Nucleotide substitutions were distributed throughout the studied domains (Fig. 4). A total of 30 polymorphic positions were found in Rsp1, 4 of which (13.3%) were at the predicted TSR5, 5 (16.7%) at TSR6, 6 (20%) at TSR7, 7 (23.3%) at TSR8, and 8 (26.7%) at RBL. In turn, 27 polymorphic positions were found in Rsp2, with 1 (3.7%) at the predicted TSR5, 1 (3.7%) at TSR6, 7 (25.9%) at TSR7, 9 (33.3%) at TSR8, and 9 (33.3%) at RBL. In comparisons among alleles at the two loci, the mean number of synonymous substitutions per synonymous site (dS) generally exceeded the mean number of nonsynonymous substitutions per nonsynonymous site (dN; Table 1). There was a significant difference between mean dS and mean dN estimated for the entire available sequence of Rsp1 alleles, as well as in TSR7 and in the combined TSR repeats (Table 1). Likewise at Rsp2, there was a significant difference between mean dS and mean dN in the combined TSR repeats (Table 1). In comparisons between the two loci, mean dS was significantly greater than mean dN overall, in

Table 1 Mean percent of synonymous substitutions per synonymous site (dS) and of nonsynonymous substitutions per nonsynonymous site (dN) in Rsp sequences

           Rsp1                     Rsp2                     Rsp1 vs. Rsp2
           dS±SE      dN±SE         dS±SE      dN±SE         dS±SE        dN±SE
TSR6       1.8±1.1    0.1±0.1       0.0±0.0    0.1±0.2       13.9±6.4     2.8±1.1
TSR7       4.4±2.1    0.1±0.1a      3.2±1.6    0.3±0.2       32.2±11.5    0.6±0.4b,d
TSR8       1.1±0.8    0.8±0.4       2.1±0.9    0.6±0.3       23.8±8.4     1.1±0.5b,e
All TSR    2.3±0.8    0.4±0.2a      1.7±0.5    0.3±0.1a      22.6±4.7     1.5±0.4a,f
RBL        0.3±0.3    1.0±0.6       1.0±0.6    0.8±0.3       8.3±4.5      4.4±1.1
All        2.6±0.7    0.5±0.2b      0.5±0.2    0.4±0.1       19.0±3.2     3.0±0.7c

a Tests of the hypothesis that dS equals dN, P