THE JOURNAL OF BIOLOGICAL CHEMISTRY

Vol. 267, No. 14, Issue of May 15, pp. 10070-10076, 1992 Printed in U.S.A.

© 1992 by The American Society for Biochemistry and Molecular Biology, Inc.

Cloning and Developmental Expression of the α3 Chain of Chicken Type IX Collagen*

(Received for publication, November 26, 1991)

Ronit Har-El‡§, Yagya D. Sharma‡¶, Angelica Aguilera‡‖, Naoki Ueyama‡**, Jiann-Jiu Wu‡‡, David R. Eyre‡‡§§, Ljiljana Juričić‡, Subramaniam Chandrasekaran‡, Ming Li‡, Hyun-Duck Nah‡, William B. Upholt‡¶¶, and Marvin L. Tanzer‡

From the ‡Department of BioStructure and Function, School of Dental Medicine, University of Connecticut Health Center, Farmington, Connecticut 06030 and the Departments of ‡‡Orthopedics and §§Biochemistry, University of Washington, Seattle, Washington 98195

Fibrous and nonfibrous collagens comprise two major groups within the collagen family and both groups are found in a diverse variety of tissue fabrics. Type IX collagen is in the nonfibrous group; three different subunits of type IX collagen have been identified and the α1 and α2 subunits have been cloned. Using molecular cloning methods we have isolated, from an embryonic chicken cartilage library, cDNA clones which code for the entire α3 chain of chicken type IX collagen. The cDNA clones encompass 2416 base pairs which have a conceptual open reading frame for a protein containing 675 amino acids including 193 Gly-X-Y repeats. These collagen repeats are in three separate domains which are interspersed with four major noncollagen domains. The collagen repeats also have four minor interruptions. This chain organization directly aligns both with the α1 and α2 chains of chicken type IX collagen. Comparison of the deduced amino acid sequence with peptide sequences of type IX collagens shows identity with 95 of the 96 known residues of the chicken α3 chain and 81 of the 98 known residues of the bovine α3 chain. The identical residues match those in five peptide fragments, two from the bovine protein and three from the chicken protein. The chicken and bovine α3 chains have conserved cross-linking sites, separated by 137 residues which span 40 nm, the length of the hole zone in a collagen fibril. The NC3 domain of the chicken α3 chain contains a repeat Cys-Pro motif which is present in both vertebrate and invertebrate nonfibrillar collagens. Northern blot hybridization exhibits a major mRNA of about 3.3 kilobases; this transcript is found in cartilaginous tissues in the embryo, including the developing limb, and is not detected in other tissues or in the precondensation stage of limb development. The composite data delineate the primary structure of the α3 chain of chicken type IX collagen, show its close relationship to the α1 and α2 chains, demonstrate its mRNA transcript, and show the appearance of that transcript in tissues of the developing chick embryo.

* This work was supported by Grants AR-12683 (to M. L. T.), HD22896 (to W. B. U.), DE 07131 (to H. D. N.), and HD-26610 (to M. L. T. and W. B. U.) from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) M83179.

§ Present address: Dept. of Cellular Biochemistry, Hebrew University of Jerusalem, Israel 91010.

¶ Present address: Dept. of Biotechnology, All India Institute of Medical Sciences, New Delhi-110 029, India.

‖ Present address: Fels Institute, Temple University, Philadelphia, PA 19140.

** Present address: Dept. of Orthopedic Surgery, Jikei University School of Medicine, Tokyo, 105, Japan.

¶¶ To whom correspondence should be addressed: Dept. of BioStructure and Function, University of Connecticut Health Center, Farmington, CT 06030. Tel.: 203-679-3388; Fax: 203-679-2910.

The ubiquity and diversity of collagenous structures throughout the animal kingdom emphasize collagen's importance in providing extracellular frameworks for the tissues and organs of all multicellular animals. These frameworks assume a variety of architectural forms and the constituent collagen molecules themselves show remarkable variation in size, overall molecular shape, and chemical composition (Burgeson, 1988; van der Rest and Garrone, 1991). Current classification of collagenous proteins assigns them to two distinct groups, the fiber-forming and non-fiber-forming molecules (Sandell and Boyd, 1990; Vuorio and de Crombrugghe, 1990). Type IX collagen is a prototype of the latter group and it exhibits some unusual features. It is modified by a single chondroitin sulfate glycosaminoglycan chain which can vary in length, depending upon the tissue source of the type IX collagen (Brewton et al., 1991; Yada et al., 1990). The relative degree of glycosaminoglycan chain addition may vary with metabolic turnover, since the newly synthesized chicken type IX collagen is almost completely modified (van der Rest and Mayne, 1987), while the pool of extracted bovine molecules is minimally modified (Ayad et al., 1989). Another feature is that the α1(IX) gene has two promoters which are preferentially utilized in different tissues (Nishimura et al., 1989) and which give rise to proteins with different amino-terminal globular domains. Furthermore, in addition to containing intramolecular disulfide cross-links, type IX molecules are cross-linked, via aldehyde-derived moieties, to the surface of fibrils comprised primarily of type II molecules (Eyre et al., 1987; van der Rest and Mayne, 1988). This latter observation has inspired suggestions for the biological role(s) of type IX collagen, including interactions between type IX collagen and other components of the extracellular matrix, e.g. proteoglycans. Also, type IX collagen may limit the diameters of type II fibrils by inhibiting their further expansion, thereby controlling the range of fibril diameters.

The molecular architecture of type IX collagen molecules has been deciphered by a combination of methods, including rotary shadowing in conjunction with specific antibodies (Vaughn et al., 1988), separation of component chains by chromatographic means (Noro et al., 1983), peptide sequence analysis (van der Rest et al., 1985), and molecular cloning (Ninomiya and Olsen, 1984). The composite data have provided a generally accepted model of type IX collagen in which three genetically distinct collagen chains, α1, α2, and α3, are organized in register, each with its amino terminus at the same end of the molecule (Shimokomaki et al., 1990). In some tissues, the α1 chain has a large globular extension at its amino terminus, in contrast to the other two chains (Vasios et al., 1988).

One of the features of type IX collagen which remains to be deciphered is the complete primary structure of its α3 chain. Peptide sequences have been reported for this subunit (van der Rest et al., 1985), including sequences encompassing cross-linking sites for attachment to type II collagen (Eyre et al., 1991a; Shimokomaki et al., 1990). In the present study we describe the molecular cloning of the cDNA for the α3 chain of chicken type IX collagen. The cloned cDNA almost completely matches the avian peptide sequences and shows strong identity with the bovine peptide sequences. An unusual feature of the molecular cloning study was that we initially isolated and sequenced a putative type IX clone from an annelid library. However, further investigation revealed that this clone was most likely an avian contaminant in the annelid cDNA library. The basis for this conclusion is described later in this paper.

EXPERIMENTAL PROCEDURES

Materials-The marine annelid, Nereis virens, was obtained from a local bait shop in Niantic, CT, and transported to the laboratory. Chicken embryos, 14 days old, were received from the University of Connecticut Agricultural Station, Storrs, CT. Reagents for biochemical and molecular cloning experiments were of the highest quality available from commercial vendors.

Bovine Type IX Collagen Peptide Isolation and Sequence Determination-Type IX collagen was extracted by mild pepsin digestion from fetal calf epiphyseal cartilage and partially purified by salt precipitation (Wu and Eyre, 1984). A portion of one preparation was reacted with tritiated sodium borohydride to label the reducible cross-linking residues (Wu and Eyre, 1989). Another preparation, which sodium dodecyl sulfate-polyacrylamide electrophoresis showed was especially enriched in partial cleavage products that contained uncleaved NC3 domains, was reduced and carboxymethylated with [14C]iodoacetate to label cysteine residues. The reduced and alkylated individual chains were separated by reverse phase HPLC¹ and each chain was digested with either trypsin or cyanogen bromide. Individual peptides containing 3H-labeled reducible cross-links or 14C-labeled cysteines were purified by ion-exchange and reverse phase HPLC and identified by protein microsequencing on a Porton 2090E gas-phase sequenator with on-line phenylthiohydantoin-derivative identification.

¹ The abbreviation used is: HPLC, high performance liquid chromatography.

Recombinant DNA Methods and Nucleotide Sequence Analysis-Live annelids or live chicken embryos were killed by decapitation. In the case of the annelids, the cuticular exterior was dissected from the rest of the body, while in the case of the chickens, the sternal cartilage was dissected free. Both the cuticle and the sternal preparations were immediately homogenized in guanidinium thiocyanate (Han et al., 1987), followed by processing to recover total RNA. The poly(A+) fraction of the annelid RNA was recovered following oligo(dT) chromatography and used to construct a cDNA library in λORF8 bacteriophage (Meissner et al., 1987). The chicken poly(A+) RNA was isolated by a similar procedure by Clontech (La Jolla, CA) and cloned into the phage λgt10 (Huynh et al., 1985). The annelid recombinants were screened with a probe derived from the plasmid pJJ103, containing a C. elegans collagen gene, col-1, kindly provided by Dr. D. Hirsh, University of Colorado (Kramer et al., 1982). The derived probe, called HBl-D550, was obtained by sequential digestions with restriction endonucleases HindIII and DdeI; the fragment contained 550 base pairs which primarily code for the collagenous domain of col-1. Positive plaques were purified, their inserts released from the vector, and the inserts were further characterized by cleavage with the restriction endonuclease Sau96I; those clones which yielded a typical "collagen ladder" sequence (Vasios et al., 1987) were tentatively identified as containing collagenous sequences. One such clone, termed RLT1 (Fig. 1B), was subcloned into phages M13 mp18 and M13 mp19 (Yanisch-Perron et al., 1985) for single strand nucleotide sequencing of the sense and antisense strands (Sanger et al., 1977). Full sequencing of both strands was carried out using synthetic oligonucleotide primers.

Following the determination of the sequence of the annelid clone, probes A and B (Fig. 1) were prepared by the polymerase chain reaction and used to screen a chicken embryo cartilage cDNA library, prepared in λgt10. Approximately 1 out of 2 × 10³ plaques were positive. Three λgt10 clones (CCOL9A3a-c, Fig. 1C) were chosen for further characterization and their EcoRI fragments were subcloned into Bluescript SK-. The 400-base pair EcoRI fragment at the 3' end of clone CCOL9A3b, namely probe C (Fig. 1A), was isolated from low melting temperature-agarose, labeled by a random primer procedure (Feinberg and Vogelstein, 1984), and used to screen the same library to obtain clones which extended further in the 3' direction. Once again, about 1 out of 2 × 10³ plaques were positive. Southern blots of EcoRI fragments of selected clones were hybridized with probe C, and two clones, CCOL9A3d and CCOL9A3e (Fig. 1D), containing the longest EcoRI fragments, were selected. These two hybridizing EcoRI fragments were subcloned into Bluescript SK- and sequenced. All clones isolated from the chicken embryo cDNA library were sequenced by the dideoxy chain termination method using double stranded templates. All regions of the cDNA were sequenced at least once on both strands. Analysis and comparison of nucleic acid and protein sequences from both chicken and annelid clones was done using the computer programs provided by Genetics Computer Group software (Devereux et al., 1984).
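The screening frequency quoted above can be checked against the screening depth reported under "Results" (4 × 10⁵ plaques, about 200 strong positives). The following is only an illustrative sketch of that arithmetic; the function and constant names are hypothetical and are not part of the original methods.

```python
def expected_positives(plaques_screened: int, plaques_per_positive: int) -> float:
    """Expected number of hybridizing plaques for a given screening depth."""
    return plaques_screened / plaques_per_positive


# Figures quoted in the text: about 1 positive per 2 x 10^3 plaques,
# and 4 x 10^5 plaques screened in the chicken cartilage library.
PLAQUES_SCREENED = 4 * 10**5
PLAQUES_PER_POSITIVE = 2 * 10**3

if __name__ == "__main__":
    n = expected_positives(PLAQUES_SCREENED, PLAQUES_PER_POSITIVE)
    print(f"expected positives: {n:.0f}")
```

The two figures are mutually consistent: a frequency of 1 in 2 × 10³ applied to 4 × 10⁵ plaques gives the roughly 200 positives that were observed.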
Northern Blot Hybridization-About 20 μg of total RNA was separated in 1% agarose gels in the presence of 0.66 M formaldehyde. The RNA was transferred to a nitrocellulose filter and hybridized to probe D (Fig. 1). Probe D had been prepared by polymerase chain reaction followed by labeling the strand complementary to the mRNA with [32P]dCTP using the 3' polymerase chain reaction primer with the Klenow fragment of DNA polymerase I. Filters were prehybridized for 3 h and then hybridized overnight at 42 °C with the probe in 50% formamide, 0.1% sodium dodecyl sulfate, 5× SSPE (0.75 M NaCl, 50 mM NaH2PO4, and 5 mM EDTA), 5× Denhardt's solution, and 100 μg/ml salmon sperm DNA. They were washed three times, 10 min each time, at room temperature with 2× SSC, then once, 1 h at 65 °C with 0.5× SSC, followed by washing twice at 65 °C with 0.1× SSC for 45 min each time. Autoradiographic exposure of the x-ray film was from 2 h to 5 days at -70 °C.

RESULTS

FIG. 1. Alignment and sequencing strategy of the cDNA clones for the chicken α3(IX) chain. The relative sizes and location of the clones are indicated as is the sequencing strategy. Clone RLT1 was obtained from the annelid library while the other clones were from the chicken library. A, composite RNA showing the translation start site, AUG, collagenous domains as shaded boxes joined to noncollagenous domains, and the termination codon, Term. B, clone RLT1 from the marine annelid cDNA library. C, clones isolated from the chicken cDNA library using probes A and B. D, clones isolated from the chicken cDNA library using probe C. Vertical lines on clones CCOL9A3b-e indicate EcoRI cleavage sites within the cDNAs which were used for subcloning.

FIG. 2. Nucleotide and deduced amino acid sequences of chicken α3(IX) cDNA. Boxed amino acids represent noncollagen interruptions of the Gly-X-Y repeat sequences. The circled lysine residues correspond to potential or identified cross-link sites, as described in the text. The underlined sequences correspond to peptide sequences obtained from the chicken α3(IX) chain (Mayne et al., 1985; Shimokomaki et al., 1990; Gordon et al., 1987). The arrow indicates the putative cleavage site of the signal peptide, as predicted using the methods of von Heijne (1983). Positions of single nucleotide differences between the various clones are: 1003 where clone CCOL9A3a has A in place of G, 1018 where clones RLT1 and CCOL9A3b have T in place of C, 1080 where clone CCOL9A3a has G in place of T, 1120 where clone CCOL9A3c has T in place of C, 1178 where clone CCOL9A3c has C in place of G, and 1665 where clone CCOL9A3b has G in place of T. These differences were confirmed by sequencing both strands of all clones in these regions.

Clonal Isolates and Their Relationships-Screening of the annelid library was done with the 550-base pair probe which codes primarily for collagen sequences and was obtained from C. elegans col-1 as described above. Two positive recombinants were isolated and purified; restriction enzyme mapping showed they were virtually identical. The inserts from the clones exhibited a typical collagen ladder upon cleavage with the restriction enzyme Sau96I, providing independent confirmation that they contained sequences which potentially code for Gly-X-Y repeats (Vasios et al., 1987). The insert RLT1, about 1.4 kilobase pairs, was isolated, purified, and subcloned for sequence characterization (Fig. 1). This clone was used to screen 4 × 10⁵ plaques of the chicken library and approximately 200 strongly positive plaques were readily detected. Three of the more reactive plaques were lifted and re-screened until individual clones were obtained. Since these initial clones (Fig. 1) did not extend to the termination codon, the 400-base pair EcoRI fragment at the 3' end of clone CCOL9A3b was used as a probe to rescreen the library as described above ("Experimental Procedures"). The resultant clones are described below.

Nucleotide Sequence and Deduced Amino Acid Sequence-Sequence analysis of the clones in Fig. 1 yielded a composite sequence of 2416 base pairs containing an open reading frame which provided a deduced amino acid sequence of 675 residues (Fig. 2), and included 22 base pairs of 5'-nontranslated sequence, as well as the termination codon plus 366 base pairs of 3'-nontranslated sequence. The major theme in this deduced sequence is that of a collagen, with 193 Gly-X-Y repeats sequestered in three domains, and four major noncollagenous interruptions (Fig. 2; Table I). In addition to the major noncollagen domains, known as NC1 through NC4, there are four shorter interruptions, 3 of 2 amino acids each and 1 of 3 amino acids. The COL1 domain contains two of the interruptions while


TABLE I
Comparison of type IX collagen domains in each chain

Domain    α1 chainᵃ    α2 chainᵃ    α3 chain
NC1       21ᵇ          15           17
COL1      115          115          112
NC2       30           30           31
COL2      339          339          339
NC3       12           17           [illegible]
COL3      137          137          137
NC4       243ᶜ         3ᶜ           3ᶜ

ᵃ The data for the α1 and α2 chains are from Ninomiya et al. (1990) and the cartilage-related form of the α1 chain is used for comparison.
ᵇ Number of residues per domain.
ᶜ Number of residues, less the presumptive signal peptide. Each of the three chains has a unique deduced signal peptide in the NC4 domain; it is tentatively assigned as 23 residues for the α1 chain (Ninomiya et al., 1990), and as 21 residues for the other two chains. An alternative form of the α1 chain has 2 residues in the NC4 domain and a unique signal peptide of 23 residues (Nishimura et al., 1989).
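The collagenous domain sizes of the α3 chain can be cross-checked against the repeat and interruption counts given in the text. The sketch below assumes that the four short interruptions are counted within the COL domain lengths, which is an interpretation and not something the paper states explicitly; all names are hypothetical.

```python
# Values quoted in the text and Table I for the chicken alpha3(IX) chain.
GLY_X_Y_REPEATS = 193                  # triplets of three residues each
SHORT_INTERRUPTIONS = [2, 2, 2, 3]     # three of 2 residues, one of 3 residues
COL_DOMAINS_ALPHA3 = {"COL1": 112, "COL2": 339, "COL3": 137}
GLY_PRO_PRO_BY_DOMAIN = {"COL3": 12, "COL2": 8, "COL1": 3}


def collagenous_residues(repeats: int, interruptions: list[int]) -> int:
    """Residues expected in the COL domains: triplets plus short interruptions."""
    return 3 * repeats + sum(interruptions)


if __name__ == "__main__":
    expected = collagenous_residues(GLY_X_Y_REPEATS, SHORT_INTERRUPTIONS)
    observed = sum(COL_DOMAINS_ALPHA3.values())
    print(f"from repeats: {expected}, from domain sizes: {observed}")
    print(f"Gly-Pro-Pro total: {sum(GLY_PRO_PRO_BY_DOMAIN.values())}")
```

Under that assumption the two totals agree (3 × 193 + 9 = 112 + 339 + 137), and the per-domain Gly-Pro-Pro counts add up to the 23 reported.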



COL2 and COL3 each contain one interruption. The deduced protein contains 23 Gly-Pro-Pro sequences, distributed as 12 in the COL3 domain, 8 in the COL2 domain, and 3 in the COL1 domain. It also has two Gly-Gly sequences, at residues 129-130 and 195-196. There is a single consensus sequence for N-linked carbohydrate substitution at residue 479. There are three locations of the sequence (underlined, Fig. 2) where peptide sequences of the chicken α3(IX) chain are available and these sequences closely match the deduced amino acids. The largest of the peptides has 56 reported residues, of which 55 match the deduced sequences. The mismatch is Glu in the peptide sequence and Pro (residue 136) in the deduced sequence. The peptide sequence shows no amino acid assignment at a potential cross-linking site; it is Lys in the deduced sequence. The second largest peptide, spanning NC2 and COL1, shows complete identity of the 32 identified amino acids with the derived sequences. The smallest peptide shows that 9 of 9 amino acids match the deduced sequence of two of the clones. However, in a third clone, namely clone CCOL9A3b, nucleotide 1665 is a G in place of a T and thus encodes an Arg in place of the Met found in the peptide. Thus, two of three clones completely agree with the smaller peptide sequence, with one clone differing at 1 amino acid position. At five additional positions there are single nucleotide differences between the various clones, two of which result in amino acid changes in position Y of the Gly-X-Y triplet. These positions are identified in Fig. 2.

Hybridization Studies-Using probe D (Fig. 1A) under high stringency conditions ("Experimental Procedures"), a single 3.3-kilobase transcript was detected in hybridization studies of chick embryonic RNA (Fig. 3). This transcript was seen in hybridization analyses of cartilaginous embryonic chicken tissues, namely sternum and late stage limb buds (Fig. 4) but not in embryonic skin, muscle, liver, heart, brain, or calvarium (data not shown). Precartilaginous limb bud, prior to the condensation phase (stage 21), lacked a signal which then appeared at the condensation phase (stage 22); subsequently, the developing limb showed progressive increases in the hybridization signal (Fig. 4).

FIG. 3. Northern blot analysis of 20 μg of total RNA obtained from the sternal cartilage of 14-day-old chicken embryos. The sizes of the hybridizing transcripts were determined from RNA markers, 2.3, 4.4, 7.4, and 9.5 kilobases, in a parallel lane. The probe used for hybridization spans nucleotides 1643-2378 in Fig. 2 and is identified as probe D in Fig. 1. This probe was chosen to contain a high proportion of noncollagenous and noncoding sequences (55% of total bases) to minimize nonspecific hybridization to other collagen-encoding mRNAs.

FIG. 4. Northern blot hybridization of total RNA obtained from limbs of stage 21, 22, 25, 28, and 31 chicken embryos. Panel A, 40 μg of total RNA from stage 22 and 25 limb buds were hybridized and the autoradiographic exposure was for 3 days. Panel B, 20 μg of total RNA from stage 21, 25, 28, and 31 limb buds were hybridized and the autoradiographic exposure was for 3 days. The probe used was the same as for Fig. 3.

DISCUSSION

The original cloned cDNA was obtained from the annelid library (Fig. 1) and initially we assumed that this clone was of annelid origin. That assumption was reinforced by hybridization of the clone to annelid mRNA and by in situ hybridization results.² In retrospect, we believe these results were due to cross-hybridization, since low stringency conditions were required to elicit the signals. The most compelling reasons to assign the clone to an avian source are that: 1) its nucleotide sequences are identical with independently isolated clones of avian origin; 2) only a single such clone was found in the annelid library; and 3) the deduced amino acid sequences are virtually identical with the corresponding sequences of an avian peptide as well as being highly conserved when compared to two bovine peptides (see below). Moreover, the overlapping avian clones, CCOL9A3d and CCOL9A3e, near the 3' end of the cDNA show identity of their deduced amino acids with two other avian peptides. We thus conclude that the clone we selected from the annelid library, using a nematode collagen probe, was of avian origin. It is not clear which portion of the original nematode probe cross-hybridized to the selected clone.

² R. Har-El, H.-D. Nah, L. Juričić, and M. Tanzer, unpublished data.

It is unlikely that the cloned fragments shown in Fig. 1 encompass the entire mRNA because Northern hybridization analysis shows an mRNA of 3.3 kilobases, whereas the cloned sequences account for about 2.4 kilobases of the mRNA. One contribution to this discrepancy is that the 3' ends of the λ clones CCOL9A3d and CCOL9A3e have not been isolated and sequenced; those 3' ends were released by cleavage of internal EcoRI sites in the clone and flanking sites in the vector, and they have not yet been recovered for additional sequencing. Another indication of an incomplete nontranslated 3' sequence is that consensus sequences for a polyadenylation signal are not seen, nor is there a poly(A) tail.
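The length accounting behind this discrepancy can be made explicit. The sketch below simply adds up the segment lengths reported under "Results" and compares the total with the Northern blot estimate; the 3.3-kilobase figure is an approximate gel-based size, so the remainder is only a rough indication. Names are hypothetical.

```python
# Segment lengths reported for the composite cDNA (base pairs).
FIVE_PRIME_UTR_BP = 22
CODONS = 675                 # deduced amino acid residues
STOP_CODON_BP = 3
THREE_PRIME_UTR_BP = 366
MRNA_ESTIMATE_BP = 3300      # approximate size from the Northern blot


def cdna_length(five_utr: int, codons: int, stop: int, three_utr: int) -> int:
    """Total cDNA length: 5' UTR + coding region + stop codon + 3' UTR."""
    return five_utr + 3 * codons + stop + three_utr


if __name__ == "__main__":
    total = cdna_length(FIVE_PRIME_UTR_BP, CODONS, STOP_CODON_BP, THREE_PRIME_UTR_BP)
    print(f"composite cDNA: {total} bp")
    print(f"not accounted for: about {MRNA_ESTIMATE_BP - total} bases")
```

The segments sum to the reported 2416 base pairs, leaving roughly 0.9 kilobase of the transcript unaccounted for, which fits the explanation that 3'-nontranslated sequence and a poly(A) tail are missing from the clones.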


The composite data shown in Figs. 1 and 2 illustrate the cDNA coding for the entire amino acid sequence of the avian α3 chain of type IX collagen. The deduced amino acid sequence contains seven domains equivalent to those seen in both the α1 and α2 chains (Figs. 5 and 6, Table I). Moreover, there is alignment of the domains in each of the three chains, although the most amino-terminal domain is greatly extended in one particular form of the α1 chain. That chain displays at least 4 different mRNA species, partially related to tissue source, which result from two alternative promoters (Nishimura et al., 1989) and from two differing sites of polyadenylation (Svoboda et al., 1988). Since type IX molecules appear to always be heteropolymeric, coordinated regulation of expression of the three constituent chains would be anticipated. This conjecture can be tested now that each of the genes has been identified. Pairwise amino acid sequence comparisons between all combinations of the α1, α2, and α3 chains showed that they ranged between 49 and 55% identity (data not shown).

Although the overall distribution of domains is virtually identical for the three type IX chains (Figs. 5 and 6) there are important differences. Table I shows that COL3 and COL2 are identical in size for the three chains while COL1 is shorter in the α3 chain. The four major noncollagenous domains differ in size, comparing the three chains, with the exception that NC4 comprises only 3 amino acids in α2 and α3, while NC2 is the same size in α1 and α2 chains. Since there is register of the collagenous domains to form a collagen triple helix, the noncollagenous domains must accommodate the resultant molecular alignment. One index of such accommodation is that a bend occurs in the molecule at the NC3 domain (Irwin et al., 1985). That domain shows more variation in size than NC2, the other internal noncollagenous region.

Continuity of the triple helices within each of the collagenous domains is interrupted, as shown in Figs. 2, 5, and 6. Moreover, the interruptions in the COL1 and COL3 domains are found in parallel locations in all three chains, except for the α1 chain in COL1, where it is displaced by one triplet (Fig. 5). The discontinuity, ARM, seen in COL2 of the α3 chain is unique to this type IX subunit and it is uncertain if it might cause local interruption in triple helix formation. The composite results indicate that all three chains contain both the major NC1 through NC4 domains as well as shorter interruptions of the Gly-X-Y repeats.

The distribution of Gly-Pro-Pro sequences in the three chains seems to fall into a pattern in which they often cluster and bracket the discontinuities in the triple helix (Fig. 5). Thus, such bracketing of Gly-Pro-Pro clusters, encompassing all three chains, is seen at the junction between NC4 and COL3, on both flanks of the interruption in COL3, at the junction between COL3 and NC3, and at the junction between NC2 and COL1. Most likely the increased abundance of Gly-Pro-Pro in those locations is related to its known ability to impart stability to the triple-helical conformation (Bhatnagar and Rapaka, 1976).

Comparison of the amino-terminal sequence of α3 with the α2 chain shows an equivalent putative signal peptide of 21 amino acids, followed by a potential peptidase cleavage site between A and Q, as seen in both the α2 and α1 sequences (Fig. 5). There is positional identity of the deduced protein with 95 of 96 amino acids of the peptide sequences of the chicken α3(IX) chain (Fig. 2), and with 81 of 98 amino acids of the bovine chain (Fig. 7), encompassing two separate loci in both cases. One of the loci of identity of the deduced sequence with the chicken protein includes a cross-linking site, while the other locus is near the carboxyl terminus of the protein. In the bovine protein, each of the loci includes a lysine-derived cross-linking site, corresponding to amino acid residues 188 and 326 of the chicken α3 chain (Fig. 2). Those loci are separated by 137 amino acids in the bovine chain, which span the length (40 nm) across the hole zone of a collagen fibril, as mapped by the cross-link attachment sites to the collagen α1(II) chain (Eyre et al., 1991a).
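The spacing argument can be illustrated numerically. The sketch below counts the residues lying between chicken residues 188 and 326 and converts that count to a length using an axial rise of about 0.286 nm per residue; this rise is a commonly quoted value for the collagen triple helix and is an assumption added here, not a figure from this paper. It also assumes the whole stretch is triple-helical. Names are hypothetical.

```python
# Assumed axial rise per residue in a collagen triple helix (nm).
# This value is NOT taken from the paper; it is a commonly quoted figure.
RISE_PER_RESIDUE_NM = 0.286


def residues_between(first_site: int, second_site: int) -> int:
    """Number of residues strictly between two numbered positions."""
    return second_site - first_site - 1


def helical_span_nm(n_residues: int, rise_nm: float = RISE_PER_RESIDUE_NM) -> float:
    """Approximate axial length of a fully triple-helical stretch."""
    return n_residues * rise_nm


if __name__ == "__main__":
    n = residues_between(188, 326)  # cross-link lysines in the chicken alpha3 chain
    print(f"residues between the lysines: {n}")
    print(f"approximate span: {helical_span_nm(n):.1f} nm")
```

With these assumptions the 137 intervening residues correspond to a span of roughly 39 nm, in line with the 40-nm hole zone cited in the text.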
It is noteworthy that, in the cloned avian protein there arealso 137 amino acids between these loci, excluding the equivalent lysines (Figs. 2 and 6). Thus, the 2 lysines are precisely conserved'in location in the collagens of both species, along with most of the adjacent amino acids. The high degree of homology be> on

a3 a2

101

200

201

300

a1

PGEIGKEGEK GSPGPFGPPG IFGSVGLQGP RGLRGLPGPH GPAGDRGDIG F R S I F G L P G R A G D X N K GWGFRGPKG DTGRPGPKCN EGARGLIGEP LGK -Q G P K ~ E E GEQOVPCP~FGPQGQRGYPGMA GPKGETGPAG YKGMVGTIGA AGRPGREGPK RFGDPGEKG ELGGRGIRGP QGDIGPKGPI SGELGKQXK GEEGGGGPIG EVGACGPLGI P G I R G I T G I T G P K G N K W G LDXFGPQOL FGApGGQGQn GPVGmGPKG ERGPQOTRGI NGLFGPKGES

a3

GIPGKOGI~DG APGLEGEKGD

301 ~+&VFGE@

GPNGLPGLPG RAGIKGSKGE

PGSPGEMGE~~ GPSGEPGIPG WGIPGDRGL

400 PGPRGATGW GLPGPIGAFG

a2 GLFGIDOKOG TFGIPWKGT AGQPGRPGPP GHRGQAGLPG QPGSKOGPGD KGEVGARGQQ GITGTPGLOG EPGPFGDAGT AGVPGLKGDR GERGWGAFG a1 GLPOWOREO IPOHPGAKGE PGKPGTPGC OP~GLPGLPG SPGMKGIFGP KW%PGVP GLMGNSGKPG EQGTEGE~~GP T G P R ~ P G S RGEPGPAEPG a", 500 SGEFGLPGET GIRGESGDRG PAGVIGAKGS QGIAGADGLP G D K G E L G P P ~ G Q K G E P G K RGELGPKGAQ GPNGTAGAPG IPGHPGPMGH EAGQSGPKGE m P G I P G 3 GLEGVKGDKG SPGKTGPKGS TGDPGVHGLAOVKGEKGESG EPGPKGQCGI QGELGLPGPS GDAGSPGVRG YPXPGPRGL al LPGKWGPQOD I G L P G L P G S GLFGGKGWG SAGEFGPKGE QOAPGSEGDA GEKGDLGDHG LPGAKGSVGN PGDPGSRGPE GSRGLFGMEG PRGAWPRGL

a3 a2

VROPQGPKGA 501

600

RPGP AGPPGPFGAT GWGHEGARUPGYRGETGE IZDPGPRGDT PFGP P-PGEQ

GWGPMGPRG LPGLLGAAGQ IGNIGPKGKR R P G a G A P G p p G E N GPFGQLCIPRG L P G L K W G E IGFXGPKGEA

FIG. 5. Comparison, by computer program alignment, of the cloned chicken a3 chain withthe other two chains of chicken type IX collagen (Nishimura et al., 1989; Ninomiya et al., 1990).The PILEUP program (Devereux et al., 1984) of the Genetics Computer Group software package was used for the comparisons. In the case of the a1 chain, the corneal form of this chain, which includes only a signal peptide and the 2 amino acids adjacent to COL3, was used for comparison. The noncollagenous domains and interruptions are boxed, potential lysine (K) cross-link sites are circled, the Gly-Pro-Pro triplets are underlined, and the circled serine (S) in the a2 chain is the siteof glycosaminoglycanattachment. The numbering system in this figure differs somewhat from Fig. 2 in order to delineate the comparison between the threechains.

e

Avian a3(IX) Collagen cDNA

10075

1

FIG. 6. Schematic diagram of the three subunits of type IX collagen, 1 drawn to scale with equal numbers of amino acids per linear dimension (except for NC4 of a2 and a3 chains which longer than the scale). The narrow shaft arrows designate interruptions inthe triplet repeat which occur in all three chains. while the wide shaft arrows designate both actual and poteitial (by analogy to bovine type IX collagen) cross-linking sites.

:.

'

:

'..

'

..

1 1 :

H.:

'

..

.:

.

:

:. ..

: '

.. .

. .

. :

... ..

":H.', .

1

1

,

.:.. ,. ..:.:

+

1 1

-I

H 1

1

I

1 1

~~

NC4

135 Chicken ~3 Bovine .A Chicken a 3 Bovine 03

Chicken a3 Bovine 03

FIG. 7. Comparison, by computer program alignment, of the cloned chicken cDNA with peptides from the bovine α3(IX) chain from this article (residues corresponding to numbers 135-185 of the chicken sequences) and from published work (Eyre et al., 1991a). Shading signifies identical residues. The BESTFIT program of the Genetics Computer Group software package was used for the comparisons. The numbers correspond to the numbered amino acids in Fig. 2. Note that the amino acid sequence of the larger bovine peptide shown here corresponds to both the chicken peptide sequence and the deduced amino acid sequence of the chicken subunit (Fig. 2).

[Fig. 8 alignment; rows: chicken α3(IX), bovine α3(IX), chicken α2(IX), bovine α2(IX), chicken α1(IX), bovine α1(IX), human α1(IX), C. elegans COL-8, C. elegans COL-19]

FIG. 8. Comparison, by computer program alignment, of the NC3 domain of the chicken α3(IX) chain with similar domains found in nonfibrous collagens. Shading indicates residues identical with those in the chicken α3(IX) chain. The comparative data are for the chicken α2(IX) (McCormick et al., 1987) and α1(IX) chains (Ninomiya and Olsen, 1984), human α1(IX) chain (Muragaki et al., 1990), bovine α3(IX), α2(IX), and α1(IX) chains (Eyre et al., 1991a, 1991b), and the C. elegans col-8 and col-19 cuticle collagen sequences (Cox et al., 1989).

tween the avian and bovine sequences signifies substantial conservation of molecular structure, similar to some other members of the collagen family (Burgeson, 1988; van der Rest and Garrone, 1991). In the bovine case, the type IX collagen is cross-linked in an antiparallel arrangement with regard to type II fibrils (Eyre et al., 1991a); that is most likely to be the case with the chicken type IX and type II collagens as well. Within the NC3 domain two tandem Cys-Pro sequences, separated by 2 amino acids, are conserved, as seen in both invertebrate and vertebrate nonfibrillar collagens (Fig. 8). In the vertebrates each of the three chains of bovine and chicken type IX collagens contains this motif, as does the human α1(IX) chain. In the invertebrates, two different chains of the C. elegans cuticle collagen also contain it. This particular Cys-Pro repeat was found in a variety of noncollagenous proteins; thus, it is not specific to the collagens.
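The tandem Cys-Pro motif described above (two Cys-Pro dipeptides separated by 2 amino acids, i.e. C-P-x-x-C-P in one-letter code) can be located in a protein sequence with a simple pattern search. The following is a minimal sketch only: the function name `find_tandem_cys_pro` and both toy peptides are hypothetical illustrations and are not sequences or software from this study.

```python
import re

# Two Cys-Pro dipeptides separated by exactly two residues (C-P-x-x-C-P).
# A lookahead is used so that overlapping occurrences are also reported.
TANDEM_CYS_PRO = re.compile(r"(?=(CP[A-Z]{2}CP))")


def find_tandem_cys_pro(sequence):
    """Return (1-based start position, matched segment) for each CPxxCP site."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in TANDEM_CYS_PRO.finditer(seq)]


# Hypothetical toy peptides, NOT sequences reported in this article.
toy_with_motif = "GAPGLKCPAACPGDK"
toy_without_motif = "GAPGLKCPAAACPGDK"  # three residues between the Cys-Pro pairs

print(find_tandem_cys_pro(toy_with_motif))
print(find_tandem_cys_pro(toy_without_motif))
```

Because the motif also occurs in noncollagenous proteins, a match from such a scan would not by itself identify a sequence as a collagen.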

The relative abundance of the mRNA transcript for the chicken α3(IX) chain varies with stage of embryonic development and specific tissue of origin (Fig. 4). Only the cartilage-containing tissues, limb and sternum, showed the mRNA transcript, while embryonic skin, muscle, liver, heart, brain, and calvaria were devoid of a signal. No signal was detected in the stage 21 limb bud, which is the stage prior to cellular condensation. Coincident with cellular condensation, a very faint signal was detected in the limbs of stage 22 embryos; it markedly increased at stage 25 and progressively amplified at stages 28 and 31. This pattern of mRNA expression during progressive stages of prechondrogenesis and chondrogenesis is analogous to that seen for the mRNAs of the α1(IX) and α2(IX) chains (Kulyk et al., 1991), as well as the mRNAs of the core protein of avian aggrecan (Kosher et al., 1986). The pattern differs from that seen for the expression of the mRNA for type II collagen; in that case, the mRNA is detectable prior to cellular condensation (Kulyk et al., 1991) and is also seen in cells which, in vivo, never undergo prechondrogenic condensation (Nah and Upholt, 1991). One implication of these comparisons is that the developmental program for expression of type II collagen significantly differs from the developmental programs for expression of type IX collagen and of aggrecan. Conceivably, the latter group of matrix molecules may be regulated by similar, if not identical, developmental programs.

The composite results provide further molecular definition of type IX collagen and delineate the complete primary structure of the α3 chain of chicken type IX collagen. Future studies of the gene itself should help define its overall structure and regulation.

REFERENCES

Ayad, S., Marriott, A., Morgan, K., and Grant, M. E. (1989) Biochem. J. 262, 753-761
Bhatnagar, R. S., and Rapaka, R. S. (1976) in Biochemistry of Collagen (Ramachandran, G. N., and Reddi, A. H., eds) pp. 481-483, Plenum Press, New York
Brewton, R. G., Wright, D. W., and Mayne, R. (1991) J. Biol. Chem. 266, 4752-4757
Burgeson, R. E. (1988) Annu. Rev. Cell Biol. 4, 551-577
Cox, G. N., Fields, C., Kramer, J. M., Rosenzweig, B., and Hirsh, D. (1989) Gene (Amst.) 76, 331-344
Devereux, J., Haeberli, P., and Smithies, O. (1984) Nucleic Acids Res. 12, 387-395
Eyre, D. R., Apon, S., Wu, J.-J., Erickson, L. H., and Walsh, K. A. (1987) FEBS Lett. 220, 337-341
Eyre, D. R., Wu, J. J., and Woods, P. (1991a) in Articular Cartilage and Osteoarthritis (Kuettner, K. E., Schleyerbach, R., Hascall, V. C., and Peyron, J. G., eds) pp. 119-130, Raven Press, Ltd., New York
Eyre, D. R., Wu, J. J., Woods, P. E., and Weis, M. A. (1991b) Br. J. Rheumatol. 30, Suppl. 1, 10-15
Feinberg, A. P., and Vogelstein, B. (1984) Anal. Biochem. 137, 266-267
Gordon, M. K., Gerecke, D. R., and Olsen, B. R. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 6040-6044

Han, J. H., Stratowa, C., and Rutter, W. J. (1987) Biochemistry 26, 1617-1625
Huynh, T. V., Young, R. A., and Davis, R. W. (1985) in DNA Cloning: A Practical Approach (Glover, D. M., ed) Vol. 1, pp. 49, IRL Press, Oxford
Irwin, M. H., Silvers, S. H., and Mayne, R. (1985) J. Cell Biol. 101, 814-823
Kosher, R. A., Gay, S. W., Kamanitz, J. R., Kulyk, W. M., Rodgers, B. J., Sai, S., Tanaka, T., and Tanzer, M. L. (1986) Dev. Biol. 118, 112-117
Kramer, J. M., Cox, G. N., and Hirsh, D. (1982) Cell 30, 599-606
Kulyk, W. M., Coelho, C. N. D., and Kosher, R. A. (1991) Matrix 11, 282-288
Mayne, R., van der Rest, M., Ninomiya, Y., and Olsen, B. R. (1985) Ann. N. Y. Acad. Sci. 460, 38-46
McCormick, D., van der Rest, M., Goodship, J., Lozano, G., Ninomiya, Y., and Olsen, B. R. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 4044-4048
Meissner, P. S., Sisk, W. P., and Berman, M. L. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 4171-4175
Muragaki, Y., Kimura, T., Ninomiya, Y., and Olsen, B. R. (1990) Eur. J. Biochem. 192, 703-708
Nah, H.-D., and Upholt, W. B. (1991) J. Biol. Chem. 266, 23446-23452
Ninomiya, Y., and Olsen, B. R. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 3014-3018
Ninomiya, Y., Castagnola, P., Gerecke, D., Gordon, M. K., Jacenko, O., LuValle, P., McCarthy, M., Muragaki, Y., Nishimura, I., Oh, S., Rosenblum, N., Sato, N., Sugrue, S., Taylor, R., Vasios, G., Yamaguchi, N., and Olsen, B. R. (1990) in Extracellular Matrix Genes (Sandell, L. J., and Boyd, C. D., eds) pp. 79-114, Academic Press, San Diego
Nishimura, I., Muragaki, Y., and Olsen, B. R. (1989) J. Biol. Chem. 264, 20033-20041
Noro, A., Kimata, K., Oika, Y., Shinomura, T., Maeda, N., and Suzuki, S. (1983) J. Biol. Chem. 258, 9323-9331
Sandell, L. J., and Boyd, C. D. (1990) in Extracellular Matrix Genes (Sandell, L. J., and Boyd, C. D., eds) pp. 1-56, Academic Press, San Diego
Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
Shimokomaki, M., Wright, D. W., Irwin, M. H., van der Rest, M., and Mayne, R. (1990) Ann. N. Y. Acad. Sci. 580, 1-7
Svoboda, K. K., Nishimura, I., Sugrue, S. P., Ninomiya, Y., and Olsen, B. R. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 7496-7500
van der Rest, M., and Garrone, R. (1991) FASEB J. 5, 2814-2823
van der Rest, M., and Mayne, R. (1987) in Structure and Function of Collagen Types (Mayne, R., and Burgeson, R. E., eds) pp. 195-221, Academic Press, New York
van der Rest, M., and Mayne, R. (1988) J. Biol. Chem. 263, 1615
van der Rest, M., Mayne, R., Ninomiya, Y., Seidah, N. G., Chrétien, M., and Olsen, B. R. (1985) J. Biol. Chem. 260, 220-225
Vasios, G., Ninomiya, Y., and Olsen, B. R. (1987) in Biology of Extracellular Matrix: Structure and Function of Collagen Types (Mayne, R., and Burgeson, R. E., eds) pp. 283-309, Academic Press, San Diego
Vasios, G., Nishimura, I., Konomi, H., van der Rest, M., and Olsen, B. R. (1988) J. Biol. Chem. 263, 2324-2329
von Heijne, G. (1983) Eur. J. Biochem. 133, 17-21
Vaughan, L., Mendler, M., Huber, S., Bruckner, P., Winterhalter, K., Irwin, M. H., and Mayne, R. (1988) J. Cell Biol. 106, 991-997
Vuorio, E., and de Crombrugghe, B. (1990) Annu. Rev. Biochem. 59, 837-872
Wu, J.-J., and Eyre, D. R. (1984) Biochem. Biophys. Res. Commun. 123, 1033-1039
Wu, J.-J., and Eyre, D. R. (1989) Connect. Tissue Res. 20, 241-246
Yada, T., Suzuki, S., Kobayashi, K., Kobayashi, M., Hoshino, T., Horie, K., and Kimata, K. (1990) J. Biol. Chem. 265, 6992-6999
Yanisch-Perron, C., Vieira, J., and Messing, J. (1985) Gene (Amst.) 33, 103-119