CD44: Structure, Function, and Association with the Malignant Process
David Naor, Ronit Vogt Sionov, and Dvorah Ish-Shalom
The Lautenberg Center for General and Tumor Immunology, The Hebrew University-Hadassah Medical School, Jerusalem 91120, Israel
I. Introduction
II. CD44 Nomenclature
III. CD44 Biochemical Structure
    A. Genomic Organization
    B. Standard CD44
    C. CD44 Isoforms
    D. Association of the CD44 Cytoplasmic Tail with the Cytoskeleton
IV. CD44 Expression on Normal Cells
    A. CD44 Isoforms in the Developing Embryo and during Maturation of the Hematopoietic System
    B. CD44 Isoforms Expressed on Epithelial Cells and on Other Nonhematopoietic Cells of the Adult
    C. CD44 Isoforms Expressed on Nonactivated and Activated Hematopoietic Cells
V. Hyaluronic Acid Is the Principal Ligand of CD44
    A. The Structure of Hyaluronic Acid and Its Distribution in Normal and Tumor-Invaded Tissues
    B. Evidence That CD44 Is an HA Receptor
    C. Binding of HA to CD44 Variants
    D. The Topography of the CD44 HA Binding Sites
    E. Influence of the CD44 Cytoplasmic Tail and Cytoskeleton-Related Proteins on HA Binding to CD44
    F. CD44 Activation Allows HA Binding in Some CD44+ Nonbinder Cell Types
    G. The Effect of CD44 Glycosylation on HA Binding
VI. Non-HA Ligands of CD44
VII. Soluble CD44
VIII. Genetic Control of CD44 Expression
IX. CD44 Functions
    A. CD44 Is a Coaccessory or Independent Receptor Involved in Transmitting Growth Signals
    B. CD44 Is a Homing Receptor
    C. Cell Binding to Endothelium or ECM via the CD44 Receptor
    D. Cell Surface CD44 Involvement in HA Internalization and Enzymatic Degradation
    E. CD44-Dependent Cell Traffic
    F. CD44-Dependent Cell Aggregation
    G. CD44 Influence on Hematopoiesis and Apoptosis
    H. Cytokine and Growth Factor Presentation by CD44
X. Involvement of CD44 in Physiological and Pathological Cell Activities
    A. The Role of CD44 in Wound Healing
    B. Endometrial CD44 Expression during the Human Menstrual Cycle
    C. CD44+ Cytolytic T Cells Protect against Malaria Infection
    D. CD44 in Rheumatoid Arthritis and Inhibition of Experimental Arthritis with Anti-CD44 mAb
    E. CD44 and the Immunodeficiency Virus
XI. CD44 Association with the Malignant Process in Experimental Models
    A. Experimental Evidence for CD44 Involvement in Malignant Processes
    B. Prevention of Primary Tumor Growth and/or Metastatic Spread in Experimental Models by Reagents Interfering with the CD44-Ligand Interaction
XII. CD44 Expression in Human Neoplasms and Its Correlation with the Malignant Status
    A. Tumors of the Nervous System
    B. Head and Neck Tumors
    C. Respiratory Tract Cancer
    D. Alimentary Tract Cancer
    E. Other Alimentary Tract Cancers
    F. Genitourinary Tract Cancer
    G. Gynecological Cancer
    H. Breast Cancer
    I. Melanomas
    J. Non-Hodgkin's Lymphoma and Chronic Myeloid Leukemia
XIII. CD44 Association with Malignancy: Some Practical Comments
XIV. Conclusions
References

Advances in CANCER RESEARCH 0065-230X/97 $25.00
Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
CD44 is a ubiquitous multistructural and multifunctional cell surface adhesion molecule involved in cell-cell and cell-matrix interactions. Twenty exons are involved in the genomic organization of this molecule. The first 5 and the last 5 exons are constant, whereas the 10 exons located between these regions are subjected to alternative splicing, resulting in the generation of a variable region. Differential utilization of the 10 variable region exons, as well as variations in N-glycosylation, O-glycosylation, and glycosaminoglycanation (by heparan sulfate or chondroitin sulfate), generates multiple isoforms (at least 20 are known) of different molecular sizes (85-230 kDa). The smallest CD44 molecule (85-95 kDa), which lacks the entire variable region, is standard CD44 (CD44s). As it is expressed mainly on cells of lymphohematopoietic origin, CD44s is also known as hematopoietic CD44 (CD44H). CD44s is a single-chain molecule composed of a distal extracellular domain (containing the ligand-binding sites), a membrane-proximal region, a transmembrane-spanning domain, and a cytoplasmic tail. The molecular sequence (with the exception of the membrane-proximal region) displays high interspecies homology. After immunological activation, T lymphocytes and other leukocytes transiently upregulate CD44 isoforms expressing variant exons (designated CD44v). A CD44 isoform containing the last 3 exon products of the variable region (CD44V8-10, also known as epithelial CD44 or CD44E) is preferentially expressed on
epithelial cells. The longest CD44 isoform expressing in tandem eight exons of the variable region (CD44V3-10) was detected in keratinocytes. Hyaluronic acid (HA), an important component of the extracellular matrix (ECM), is the principal, but by no means the only, ligand of CD44. Other CD44 ligands include the ECM components collagen, fibronectin, laminin, and chondroitin sulfate. Mucosal addressin, serglycin, osteopontin, and the class II invariant chain (Ii) are additional, ECM-unrelated, ligands of the molecule. In many, but not in all cases, CD44 does not bind HA unless it is stimulated by phorbol esters, activated by agonistic anti-CD44 antibody, or deglycosylated (e.g., by tunicamycin). CD44 is a multifunctional receptor involved in cell-cell and cell-ECM interactions, cell traffic, lymph node homing, presentation of chemokines and growth factors to traveling cells, and transmission of growth signals. CD44 also participates in the uptake and intracellular degradation of HA, as well as in transmission of signals mediating hematopoiesis and apoptosis. Many cancer cell types as well as their metastases express high levels of CD44. Whereas some tumors, such as gliomas, exclusively express standard CD44, other neoplasms, including gastrointestinal cancer, bladder cancer, uterine cervical cancer, breast cancer, and non-Hodgkin's lymphomas, also express CD44 variants. Hence CD44, particularly its variants, may be used as diagnostic or prognostic markers of at least some human malignant diseases. Furthermore, it has been shown in animal models that injection of reagents interfering with CD44-ligand interaction (e.g., CD44s- or CD44v-specific antibodies) inhibits local tumor growth and metastatic spread. These findings suggest that CD44 may confer a growth advantage on some neoplastic cells and, therefore, could be used as a target for cancer therapy.
It is hoped that identification of CD44 variants expressed on cancer but not on normal cells will lead to the development of anti-CD44 reagents restricted to the neoplastic growth.
I. INTRODUCTION

Cancer progression is dependent on a cascade of genetic alterations, each one essential, but not sufficient, to afford the tumor a growth advantage. The selection of a malignant phenotype, a rare event that occurs in only a small number of cells, is determined by the successful and complete accumulation of many metastasis-supporting activities. These include induction of vascularization, detachment from the primary tumor mass, digestion and invasion of neighboring tissues, the crossing of blood and lymph endothelium (intravasation), and hematogenous or lymphogenous migration toward the target organ. Acquisition of the ability to form aggregates and re-traverse the endothelium (extravasation) allows the tumor cells to be arrested in the capillary beds and then to colonize the secondary tissue (Poste and Fidler, 1980; Nicolson, 1987; Fidler, 1995).

Analysis of the different phases of tumor progression makes it clear that this process is tightly associated with the activity of various adhesion molecules. In fact, malignant cells exploit the same adhesive functions used by normal cells in maintaining their routine physiological activities. Normal adhesive functions are carried out by distinct families of molecules, each structurally different from the other.

The integrins are composed of two noncovalently associated subunits designated α and β. The same β subunit can interact with different α subunits and, conversely, the same α subunit can combine with different β subunits, yielding multiple heterodimer products. For example, the combination of an αL subunit with a β2 subunit generates the leukocyte function-associated antigen-1 (LFA-1; αLβ2) integrin, and interaction of the α4 subunit with the β1 subunit yields the very late antigen-4 (VLA-4; α4β1) integrin (Albelda and Buck, 1990; Dustin and Springer, 1991). The integrins (e.g., LFA-1) are implicated in interactions between CD4+ T cells and antigen-presenting cells (APC) or between CD8+ cytotoxic T cells and their target cells (Dustin and Springer, 1991). Some integrins, such as LFA-1 and VLA-4, are also involved in the binding of leukocytes to endothelial cells (Springer, 1994) and in the adherence of various cell types (including neoplastic cells) to extracellular matrix (ECM) components (Ruoslahti and Pierschbacher, 1987; Ruoslahti, 1991; Yamada, 1991).

The immunoglobulin superfamily is a functionally diverse group of molecules with variable numbers of extracellular immunoglobulin-like domains (Bierer and Burakoff, 1991; Dustin and Springer, 1991; Springer, 1994). Some members of this family, such as intercellular adhesion molecule-1 (ICAM-1) and vascular cellular adhesion molecule-1 (VCAM-1), are integrin counterreceptors: the ICAM-1 of an APC can interact with the LFA-1 of T cells at the antigen recognition phase of the immune response, and the VCAM-1 of endothelial cells binds to the VLA-4 of T lymphocytes at the "arrest" stage of the extravasation process (Dustin and Springer, 1991; Shimizu et al., 1992; Springer, 1994). Other members of this group also interact with one another (e.g., the CD2 of the T cell and the leukocyte function-associated antigen-3 [LFA-3] of an APC; Dustin and Springer, 1991).
The Ca2+-independent cell adhesion molecules (CAM) (e.g., neural cell adhesion molecule [N-CAM], Ng-CAM), most of which are localized in neural tissues, are a unique immunoglobulin superfamily subgroup. These molecules are involved in homophilic interactions during histogenesis (Albelda and Buck, 1990).

The cadherins are calcium-dependent CAM that contain internal repeats (of ~100 amino acids) that do not resemble immunoglobulin domains. Like the calcium-independent CAM, they are also involved in homophilic adhesion and in organizational events occurring in the course of embryogenesis. In addition, the structural integrity and polarity of adult tissues may also be dependent on cadherins (Albelda and Buck, 1990).

Selectin molecules are composed of a lectin-like domain, an epidermal growth factor-like domain, and a variable number of complement-regulatory repeat sequences. L-selectins, expressed on leukocytes, and P- and E-selectins, expressed on endothelial cells, are implicated in leukocyte attachment to, and rolling and arrest on, endothelium. This sequence of events leads to extravasation of leukocytes into inflamed tissues and to lymphocyte homing through the high endothelial venule into lymph nodes (Shimizu et al., 1992;
Springer, 1994). Addressins are mucin-like molecules (heavily O-glycosylated proteins rich in serine and threonine) whose sialylated carbohydrates (sialyl Lewis-like) adhere to selectins, thus allowing leukocyte-endothelial cell interactions (Springer, 1994). Cartilage link proteins and proteoglycan core proteins are ECM components that interact with hyaluronan (Doege et al., 1987; Goetinck et al., 1987; Neame et al., 1987). The CD44 molecule, which is the focus of this article, is a member of this group.

Adhesion molecules have both a negative and a positive influence on the malignant process. On the one hand, they (e.g., E-cadherin or α4β1 integrin) prevent the detachment of potential metastatic cells from the primary tumor by maintaining cell-cell contact (Behrens et al., 1989; Qian et al., 1994). On the other hand, as mentioned before, they enable the binding to endothelium of intravasated or extravasated metastatic cells (Nicolson, 1982a,b; Fidler, 1995) and the latter's migration along blood vessels, lymphatic vessels, and matrices (Nicolson, 1982a; Ruoslahti, 1992; Fidler, 1995). In addition, adhesion molecules are involved in homophilic (Updyke and Nicolson, 1986) and heterophilic (Fidler and Bucana, 1977; Fidler, 1995) aggregate formation, thereby arresting the metastatic cells in remote organ capillaries and conferring immunoresistance on them. They also support the docking of metastatic cells in secondary organs (Liotta and Stetler-Stevenson, 1991; Nakai et al., 1992; Ruoslahti, 1992; Taraboletti et al., 1993). Therefore, like all other factors involved in the malignant process, CAM are essential to, but not sufficient for, neoplastic progression. As their functions are rate limiting, blocking of adhesion molecules at any of the metastatic phases (aside from the first "detachment" step) may grind the entire metastatic cascade to a halt.
The various tumor cells exploit different families of adhesion molecules, which function at distinct phases of the malignant process. Furthermore, even the same tumor can express various adhesion molecules that may be utilized at the same or different phases of the metastatic cascade (Zahalka et al., 1995). When evaluating the role of adhesion molecules in the metastatic process, the CD44 family of molecules deserves considerable attention in view of its adhesive, locomotive, and growth-transducing functions as well as its prevalence among cancer cells. The comprehensive review by Lesley et al. (1993a) will provide the reader with the relevant CD44 literature prior to 1993.
II. CD44 NOMENCLATURE
CD44 was first described as brain-granulocyte-T lymphocyte antigen (Dalchau et al., 1980). Like many cell surface and bioactive molecules, CD44 was accorded various structural or functional names that merged to an established single name only after molecular sequencing. The synonyms gp90Hermes, extracellular matrix receptor III, homing cell adhesion molecule, phagocytic glycoprotein-1, glycoprotein 85, Ly-24, hyaluronate receptor, HUTCH-1, and In(Lu)-related p80 glycoprotein are all included in the CD44 cluster designation assigned by the Third International Workshop on Leukocyte Typing (Cobbold et al., 1987).
III. CD44 BIOCHEMICAL STRUCTURE
A. Genomic Organization

CD44 is a family of molecules consisting of many isoforms. The molecular diversity of this glycoprotein is generated by both posttranslational modification and differential utilization of alternatively spliced exons. CD44 is encoded by a single gene, located on the short arm of chromosome 11 in humans (Goodfellow et al., 1982) and on chromosome 2 in mice (Colombatti et al., 1982), spanning approximately 50 kb of human DNA (Screaton et al., 1992). CD44 genomic organization, described by Screaton et al. (1992, 1993), involves 20 exons in both mouse and humans (Fig. 1B). The first 5 exons coding for the extracellular domain (designated the 5' constant region) are constant in both species (exons 1-5), whereas the next 10 exons are subjected to alternative splicing. This generates a variable region (see Fig. 1A) containing different exon combinations (Brown et al., 1991; Dougherty et al., 1991; Cooper et al., 1992; He et al., 1992; Jackson et al., 1992; Screaton et al., 1992). Variable region exons are designated V1 (exon 6 on the sequential scale) to V10 (exon 15 on the sequential scale; V stands for variant exon). Note, however, that exon V1 is not expressed in humans (Screaton et al., 1993). Exons 16 and 17 are the first two constant exons of the 3' constant region, and they, together with part of exon 5, encode the membrane-proximal region of the extracellular domain (with optional inclusion of variant exons). The next domain (i.e., the hydrophobic transmembrane region) is encoded by exon 18 of the 3' constant region. The cytoplasmic domain is also subject to alternative splicing. Differential utilization of exons 19 and 20 generates the short version (3 amino acids) and the long version (70 amino acids) of the cytoplasmic tail, respectively. The first 3 amino acids, common to both tails, are encoded by exon 18. The DNA sequence of exon 19 carries a long A+T tract, possibly causing instability in the mRNA of the short version.
The additional amino acids of the long cytoplasmic domain are encoded by exon 20. The long version of the cytoplasmic tail is much more abundant than the short version (Goldstein and Butcher, 1990; Screaton et al., 1992).
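The exon numbering described above can be summarized in a small lookup. The following Python sketch is illustrative only: the names `variant_to_exon` and `transcript_exons` are hypothetical, and modeling each cytoplasmic tail form as using exactly one of exons 19 and 20 is a simplifying reading of the text, not an established splicing model.

```python
# Exon layout of the CD44 gene as described in the text (20 exons in total).
FIVE_PRIME_CONSTANT = [1, 2, 3, 4, 5]   # 5' constant region
SHARED_3_PRIME = [16, 17, 18]           # membrane-proximal (16, 17) and transmembrane (18)
TAIL_EXON = {"short": 19, "long": 20}   # assumption: one tail exon per transcript


def variant_to_exon(v):
    """Map variant exon Vn (n = 1..10) to its sequential exon number (6..15)."""
    if not 1 <= v <= 10:
        raise ValueError("variant exons are numbered V1 to V10")
    return v + 5


def transcript_exons(variants=(), tail="long"):
    """Return the sequential exon numbers of a transcript.

    variants: variant exon numbers to include (e.g. [8, 9, 10]); empty for CD44s.
    tail: "short" (exon 19) or "long" (exon 20).
    Note: per the text, V1 is not expressed in humans; this sketch does not enforce that.
    """
    variable = sorted(variant_to_exon(v) for v in set(variants))
    return FIVE_PRIME_CONSTANT + variable + SHARED_3_PRIME + [TAIL_EXON[tail]]


if __name__ == "__main__":
    print(transcript_exons())                          # standard CD44, long tail
    print(transcript_exons([8, 9, 10], tail="short"))  # V8-10 with the short tail
```

The lookup makes the two naming scales explicit: V-numbers are used when describing isoforms, whereas sequential exon numbers are used when describing the gene.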
B. Standard CD44

The most abundant version of CD44 is the standard one (CD44s, 85-95 kDa) (Stamenkovic et al., 1991), which lacks the entire variable region, with exon 5 of the 5' constant region being directly spliced to exon 16 of the 3' constant region (Idzerda et al., 1989; Nottenburg et al., 1988; Zhou et al., 1989; Aruffo et al., 1990; Wolffe et al., 1990; Bosworth et al., 1991; Harn et al., 1991; He et al., 1992). CD44s is expressed mainly on hematopoietic cells and is therefore also designated hematopoietic CD44 or CD44H. Northern blot analysis of CD44 RNA isolated from hematopoietic cells revealed three major transcripts in humans (~1.6, 2.2, and 4.8 kb) (Goldstein et al., 1989; Stamenkovic et al., 1989; Quackenbush et al., 1990) and three (~1.6, 3.5, and 4.5 kb) (Haegel and Ceredig, 1991) or four (1.6, 3.2, 4.0, and 5.2 kb) (Wolffe et al., 1990) in the mouse. Utilization of multiple polyadenylation signals may explain this heterogeneity (Harn et al., 1991). The human (and primate) CD44s mRNA is translated to 361 (in mouse, 363) amino acids. The predicted size of the core protein is 37-38 kDa (Goldstein et al., 1989; Idzerda et al., 1989; Nottenburg et al., 1989; Stamenkovic et al., 1989; Zhou et al., 1989; Screaton et al., 1992). However, the molecular size of CD44 is considerably increased by posttranslational modifications. Extensive glycosylation (see Fig. 1A) may occur in the mouse at 5 potential N-linked carbohydrate sites and at 10 potential O-linked carbohydrate sites (Zhou et al., 1989), and, respectively, at 6 and 7 potential sites in humans (Goldstein et al., 1989). This posttranslational modification doubles the molecular size of CD44s, bringing it from 38 kDa to 85-95 kDa (Zhou et al., 1989; Lokeshwar and Bourguignon, 1991). Indeed, tunicamycin treatment of mouse CD44 and subsequent immunoprecipitation revealed a 42-kDa molecular species (Lokeshwar and Bourguignon, 1991), a size quite close to the predicted one.
[Figure 1. Panels: A. Glycoprotein (scale bar: 10 amino acids). B. Exon map (exons 1-20; LP, exon 1; TM, exon 18; CT, exons 19 and 20). C. Alternatively spliced transcripts: 1, CD44s (short tail); 2, CD44s (long tail); 3, pMeta-1; 4, pMeta-2; 5, epithelial CD44; 6, CD44v of keratinocytes.]

Fig. 1 Schematic representation of CD44 glycoprotein (A), its exon map (B), and examples of six alternatively spliced transcripts (C).
(A) Protein structure. Using disulfide bonds, the amino terminus of the molecule forms a globular domain, or three globular subdomains. The circle and the "downstream" ellipse represent areas that influence hyaluronate (HA) binding (Peach et al., 1993; Zheng et al., 1995). The black track inside the circle refers to a region displaying 30% homology with cartilage link protein and proteoglycan core protein, both showing HA binding ability. The black track at the amino terminus (inside and outside the circle), transmembrane-spanning domain (23 amino acids), and cytoplasmic tail (70 amino acids) represents regions with 80-90% interspecies homology. The alternatively spliced short cytoplasmic domain (3 amino acids) is nonproportionately represented by a small bar. The lightly shaded track in the center indicates the nonconserved membrane-proximal region, which displays 35-45% interspecies homology. The optional variable region, containing various combinations of variant exon products (see part C of legend), is inserted between amino acids 201 and 202 (mature protein) and marked by a zigzag track. The full amino acid sequence of human and mouse CD44s is presented in Zhou et al. (1989) and the nucleotide sequence (including the variable region) of human CD44 in Screaton et al. (1992). ○, potential N-linked glycosylation sites (only those of standard CD44 are shown); X, areas rich in serine and threonine, possible sites for O-linked glycosylation (those of the variable region are arbitrarily assigned); [symbol illegible], potential sites for GAG (CS or HS) incorporation; @, potential sites for phosphorylation (only part of the sites are depicted). The symbols on the standard part of the molecule mostly refer to mouse CD44 (Zhou et al., 1989), whereas those of the variable region are based on information taken from both mouse and humans.
(B) Exon map. The filled circles represent exons of the constant regions. Empty circles represent exons that can be inserted by alternative splicing in the variable region. Note that exon V1 is not expressed in the human CD44. LP, leader peptide-encoding exon; TM, transmembrane-encoding exon; CT, cytoplasmic tail-encoding exons.
(C) Examples of alternatively spliced transcripts. 1 and 2, standard CD44 with short and long cytoplasmic tails, respectively, which lacks the entire variable region. 3, pMeta-1 (CD44V4-7). Exons V4, V5, V6, and V7 are inserted in tandem between exons 5 and 17. 4, pMeta-2 (CD44V6,7). Exons V6 and V7 are inserted between exons 5 and 17. pMeta-1 and pMeta-2 are known as "metastatic" CD44, because their cDNA confers, upon transfection, metastatic potential on nonmetastatic rat pancreatic tumor cells (Günthert et al., 1991). Note that exon 16 is not expressed in either pMeta-1 or pMeta-2. 5, epithelial CD44 (CD44V8-10), expressed preferentially on epithelial cells. Exons V8, V9, and V10 are inserted between exons 5 and 16. 6, keratinocyte CD44 (CD44V3-10), one of the largest CD44 molecules known. Exons V3 through V10 are inserted between exons 5 and 16.

Four serine-glycine motifs in the human CD44 extracellular domain and three in the mouse (Goldstein et al., 1989; Stamenkovic et al., 1989; Zhou et al., 1989) can be modified by the glycosaminoglycan (GAG) heparan sulfate (HS) (Brown et al., 1991; Tanaka et al., 1993) or by chondroitin sulfate (CS) (Jalkanen et al., 1988; Stamenkovic et al., 1989, 1991), thereby converting the molecule to a proteoglycan with possibly altered ligand specificity (Faassen et al., 1992; Jalkanen and Jalkanen, 1992). Chondroitin sulfate modification increases the size of the CD44 molecule from 85-95 kDa to 180-200 kDa (Jalkanen et al., 1988). Phosphorylation of the CD44 cytoplasmic tail (Isacke et al., 1986; Kalomiris and Bourguignon, 1988; Neame and Isacke, 1992) is another optional posttranslational modification with potential functional consequences (see Section V.E). The CD44 glycoprotein is an acidic (isoelectric point = 4.2) molecule, its charge largely due to sialic acid (Jalkanen et al., 1988). The half-life (t1/2) of CD44 turnover was found to be 8 h (Lokeshwar and Bourguignon, 1991).

As previously indicated, the CD44s molecule is composed of several domains. The human extracellular domain contains 248 amino acids (mouse, 250; the first amino acid of the mature protein is designated residue 1, and the entire precursor protein sequence is termed the primary protein). Included in this domain is the amino terminus region, a stretch of about 90 relatively hydrophobic residues (amino acids 12-101) that displays an approximately 80-90% sequence similarity among species and also a relatively high homology (~30%) with cartilage link protein and proteoglycan core protein. The amino terminus region contains six cysteine residues, which are possibly utilized to form a globular domain or three globular subdomains. The membrane-proximal region of the extracellular domain is less well conserved, with approximately 35-45% interspecies sequence similarity. The single transmembrane-spanning domain contains 23 amino acids in both mouse and humans, with 80-90% homology among different species. The CD44 cytoplasmic tail is highly conserved, as evident from the approximately 80-90% sequence similarity among species (Goldstein et al., 1989; Nottenburg et al., 1989; Stamenkovic et al., 1989; Zhou et al., 1989; Screaton et al., 1992). The human CD44 cytoplasmic tail contains 6 potential phosphorylation sites, and the mouse CD44 cytoplasmic tail 10. At least part of these sites are constitutively phosphorylated (e.g., serines 303 and 305 of the mature protein) (Isacke et al., 1986; Carter and Wayner, 1988; Camp et al., 1991; Lesley et al., 1993a; Puré et al., 1995). Although the cytoplasmic domain includes consensus phosphorylation sites for protein kinase C (PKC) and protein kinase A, as well as for cAMP- and cGMP-dependent protein kinases, there is no evidence that these sites are active (Wolffe et al., 1990; Camp et al., 1991).
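The domain sizes quoted above can be cross-checked against the predicted 37-38 kDa core protein with simple arithmetic. The Python sketch below is a rough consistency check, not a method from the source: the average residue mass of 110 Da is a common rule of thumb introduced here as an assumption, and the names `mature_length` and `approx_core_mass_kda` are hypothetical.

```python
# Domain lengths (amino acids) of CD44s with the long cytoplasmic tail, as quoted in the text.
DOMAINS = {
    "human": {"extracellular": 248, "transmembrane": 23, "cytoplasmic_long": 70},
    "mouse": {"extracellular": 250, "transmembrane": 23, "cytoplasmic_long": 70},
}
TRANSLATED_LENGTH = {"human": 361, "mouse": 363}  # primary translation product

AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average, not from the source


def mature_length(species):
    """Sum of the three domain lengths of the mature protein."""
    return sum(DOMAINS[species].values())


def approx_core_mass_kda(species):
    """Approximate unmodified core protein mass in kDa."""
    return mature_length(species) * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    for species in ("human", "mouse"):
        n = mature_length(species)
        removed = TRANSLATED_LENGTH[species] - n
        print(species, n, "residues;", removed, "not in the mature domains;",
              round(approx_core_mass_kda(species), 2), "kDa")
```

Under these assumptions the mature human protein has 341 residues (mouse, 343), giving roughly 37.5 kDa, which falls within the predicted 37-38 kDa range. The 20 residues unaccounted for in each species would be consistent with a leader peptide removed from the precursor, although that attribution is an inference here rather than a statement from the text.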
C. CD44 Isoforms

CD44s is a ubiquitous molecule expressed mainly on leukocytes but also on fibroblasts and cells of mesodermal and neuroectodermal origin. However, CD44 molecules of larger size appear on different normal and malignant cells. Alternative splicing, as well as differential posttranslational modifications (glycosylation and glycosaminoglycanation or, in short, glycanation) of distinct CD44 isoforms, enrich the CD44 repertoire, which, in turn, may increase the optional functions of this molecule. Individual cells, whether normal or malignant, can simultaneously express different CD44 isoforms. Insertion of all the variable region between amino acids 202 and 203 of the mature human CD44 sequence (see Fig. 1A) generates an extra stretch of 381 amino acids (mouse, 423), which show 64% interspecies homology (Screaton et al., 1993). The entire CD44 variable region (exon V1 to exon V10) reveals four additional potential N-glycosylation sites and a large number of O-glycosylation sites (especially at exons V2 and V8-V10: e.g., V9 and V10
exon products contain 40% and 30% serine and threonine residues, respectively). In addition, one motif (serine-glycine) for potential insertion of GAG has also been detected in the variable region (Screaton et al., 1993; Bennett et al., 1995b). CD44 variants containing the exon V3 product can be decorated by HS (Bennett et al., 1995a), and variants containing the V6 exon product can be modified on a colon carcinoma cell line by an H blood group sugar (Labarrière et al., 1994). Interestingly, the use of additional variant exons (e.g., CD44V4-7) may also enhance phosphorylation of the CD44 cytoplasmic tail (Ponta et al., 1994-95). To date, at least 20 different CD44 transcripts have been described. However, this is most likely not the final word. Theoretically, 768 membrane-bound CD44 isoforms can be generated by alternative use of the variant exons (van Weering et al., 1993). Another example of such diversity is N-CAM, which, by alternative splicing, can generate 27 isoforms out of 54 theoretical possibilities (Reyes et al., 1991). In addition to CD44s (described in Section III.B), several other CD44 isoforms deserve special attention (Fig. 1C), as they are frequently described in this chapter. Epithelial CD44 (CD44E), which is preferentially expressed on epithelial cells (see Section IV.B) (Brown et al., 1991; Dougherty et al., 1991; Hofmann et al., 1991; Stamenkovic et al., 1991; He et al., 1992; Jackson et al., 1992), utilizes exons V8, V9, and V10 of the variable region in conjunction with the 5' and 3' constant regions (Fig. 1C) to generate the CD44V8-10 isoform (130 kDa; Stamenkovic et al., 1991). pMeta-1 and pMeta-2 are known as "metastatic" isoforms because, upon transfection, their cDNAs confer metastatic potential on nonmetastatic cells (Günthert et al., 1991). In pMeta-1, exons V4, V5, V6, and V7 are inserted between the 5' and 3' constant regions, generating the CD44V4-7 variant.
Similarly, in pMeta-2, exons V6 and V7 are inserted between the same constant regions to produce the CD44V6,7 variant. Note that "constant" exon 16 is not expressed in the 3' constant region of the two variants (Herrlich et al., 1993) (Fig. 1C). The longest CD44 isoform was detected in keratinocytes. Here exons V3-V10 are inserted in tandem between the two constant regions to generate the CD44V3-10 isoform (230 kDa) (Hofmann et al., 1991) (Fig. 1C). The long additional sequence (338 amino acids) provides additional glycosylation sites, particularly O-glycosylation sites, and an attachment sequence for sulfated GAGs (HS or CS) (Brown et al., 1991; Haggerty et al., 1992; Bennett et al., 1995b), thereby further increasing its molecular size.

CD44 variants (abbreviated as CD44v) can be identified by monoclonal antibodies (mAbs) directed against specific exon product epitopes, using flow cytometry or immunohistochemical (IHC) analysis. Such assays provide partial information, as they do not allow the identification of the full-length sequence but rather define exclusive epitope(s) on a specific exon product(s). To illustrate this limitation, exon products identified by antibody are marked by a small horizontal bar above the V affiliated with the exon number. For example, the V6 exon product identified by anti-V6 mAb is registered as CD44V̄6 to emphasize that the isoform contains the V6 product. It is not known if this product is included in the sequence of other exons or if it is expressed independently. The missing information can be provided by reverse transcriptase-polymerase chain reaction (RT-PCR) (see later). In addition, mAbs that recognize epitopes on the CD44 constant region do not distinguish between CD44s and CD44v, as this region is shared by all CD44 isoforms. Such antibodies are defined as anti-pan-CD44 mAb (or, in most cases, simply anti-CD44 mAb) and the CD44 recognized by them is designated pan-CD44, or simply CD44. Furthermore, identification, following electrophoresis, of CD44 molecular species larger than 85-95 kDa by antibodies recognizing the CD44 constant region does not necessarily imply that spliced variants are present, since posttranslational modification may also account for the higher molecular size. Full-length CD44 isoform sequences are detected by RT-PCR and nucleotide sequencing. Abbreviated registration of all the exons expressed in a particular CD44 molecule is used to identify the variants (e.g., CD44V4-7 denotes pMeta-1). Finally, for reasons of convenience, we will frequently use the term "exon" in conjunction with protein CD44, although the term "exon product" is more accurate.
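The abbreviated isoform names used in this section (e.g., CD44V4-7, CD44V6,7, CD44V8-10) follow a regular pattern that can be expanded mechanically into the list of variant exons they denote. The Python sketch below illustrates this under stated assumptions: the function name `parse_cd44_name` is hypothetical, the accepted syntax is inferred only from the examples in the text, and the alias table covers only the synonyms mentioned here.

```python
import re

# Synonyms given in the text; values are the variant exon numbers they denote.
ALIASES = {
    "CD44s": [],           # standard CD44, no variant exons
    "CD44H": [],           # hematopoietic CD44, same as CD44s
    "CD44E": [8, 9, 10],   # epithelial CD44 = CD44V8-10
}

# "CD44V" followed by numbers joined by "-" (range) or "," (list), e.g. V4-7 or V6,7.
NAME_PATTERN = re.compile(r"CD44[Vv](\d+(?:[-,]\d+)*)")


def parse_cd44_name(name):
    """Expand an abbreviated CD44 isoform name into a list of variant exon numbers."""
    if name in ALIASES:
        return list(ALIASES[name])
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError("unrecognized CD44 isoform name: " + name)
    exons = []
    for part in match.group(1).split(","):
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            if first > last:
                raise ValueError("descending exon range in " + name)
            exons.extend(range(first, last + 1))
        else:
            exons.append(int(part))
    if any(not 1 <= v <= 10 for v in exons):
        raise ValueError("variant exons are numbered V1 to V10")
    return exons


if __name__ == "__main__":
    for isoform in ("CD44s", "CD44V4-7", "CD44V6,7", "CD44V8-10", "CD44V3-10"):
        print(isoform, parse_cd44_name(isoform))
```

Such a name describes only the variant exon content; as noted above, it says nothing about glycosylation or other posttranslational modifications, and antibody-based detection of a single exon product does not establish the full name.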
D. Association of the CD44 Cytoplasmic Tail with the Cytoskeleton

According to several reports, the CD44 cytoplasmic tail interacts with cytoskeleton-related components such as actin, ankyrin, or members of the ezrin-radixin-moesin family (Lacy and Underhill, 1987; Kalomiris and Bourguignon, 1988; Bourguignon et al., 1991, 1992, 1993; Lokeshwar and Bourguignon, 1991, 1992; Lokeshwar et al., 1994; Tsukita et al., 1994). A number of modifications, including PKC-mediated phosphorylation (Kalomiris and Bourguignon, 1989; Bourguignon et al., 1992), palmitoylation (Bourguignon et al., 1991), and GTP binding (Lokeshwar and Bourguignon, 1992; Galluzzo et al., 1995), may regulate the interplay between CD44 cytoplasmic tail and cytoskeleton-related proteins and, perhaps, the subsequent CD44-ligand interaction. The observation that in some cells a fraction of the CD44 molecules is resistant to extraction by detergent has been interpreted as evidence of CD44 interaction with the cytoskeleton (Tarone et al., 1984; Lacy and Underhill, 1987; Carter and Wayner, 1988; Geppert and Lipsky, 1991). However, in fibroblasts, tailless CD44 mutants, which are not able to interact with the cytoskeleton, have also been found in the detergent-insoluble fraction (Neame and Isacke, 1993; Perschl et al., 1995a), suggesting the interplay of CD44 with lipids of the plasma membrane rather than with the cytoskeleton (Neame et al., 1995). In those cases where an association between CD44 and the cytoskeleton has been demonstrated, there are contradictory findings as to the phosphorylation dependence of this event. Bourguignon and colleagues (Kalomiris and Bourguignon, 1989; Bourguignon et al., 1992) suggest that the association is phosphorylation dependent, whereas other investigators (Camp et al., 1991; Neame and Isacke, 1992) do not favor this implication. Yet another study showed that activation of T lymphocytes with phorbol ester reduces the interaction between CD44 and the cytoskeleton (Geppert and Lipsky, 1991). Again, it is uncertain whether the phosphorylation status of the activated CD44 influences this effect. Variations in the internal and/or external cell environment as well as differences in methodology may account for all these discrepancies.
IV. CD44 EXPRESSION ON NORMAL CELLS
A. CD44 Isoforms in the Developing Embryo and during Maturation of the Hematopoietic System
As CAMs are indispensable for the generation of cell communities during fetal development (Fleming et al., 1994), the role of CD44 in this process has attracted much attention. CD44 was detected by indirect immunofluorescence in early, preimplantation human embryos containing one to eight cells. The intensity of expression was maximal at the eight-cell stage and downregulated at the morula, blastocyst, and postimplantation stages (Campbell et al., 1995). During the preimplantation stage, CD44 may be involved in cell-cell homophilic interactions of embryonic blastomeres or in the heterophilic adhesion between the inner cell mass and trophectodermal cells. Immunohistochemical studies with variant-specific mAbs revealed the expression of CD44v in 10-week-old human embryos. The predominant CD44V9 isoform was found in the epidermis, trachea, lung, thyroid gland, and mesonephric and paramesonephric ducts. CD44V6 was found in the epidermis and trachea (Terpe et al., 1994b). In newborn rats, the CD44V6 isoform has been identified in basal layers of the epidermis, hair follicles, the lower part of the crypts in the colon mucosa, and ductal epithelia of pancreatic glands (Wirth et al., 1993). Interestingly, expression of the V6 exon is a hallmark of the proliferating mobile cells (see Section IV.C) populating these tissues. Transcript analysis with RT-PCR of 7.5-day-postcoitum total mouse embryos revealed that CD44v (CD44V10 and CD44V8-10) are more intensively expressed in the fetus than is the standard isoform. In situ hybridization, using a V6-V10 sequence as a probe, indicates that the strongest expression of the CD44 variants is confined to the cell layer most proximal
to the amniotic cavity. Later, between days 9.5 and 12.5 postcoitum, CD44s predominates, but larger variants are found in heart, somites, and limb bud mesenchyme (Ruiz et al., 1995). CD44 glycoproteins have also been detected on fetal rabbit fibroblasts (Alaish et al., 1994), embryonal murine neuronal cells located in the optic chiasm (Sretavan et al., 1994), and (mostly CD44v) embryonal Schwann cells, which provide the myelin sheath coating the axons (Sherman et al., 1996). The ductal epithelia of the pancreatic gland of newborn rats express the epitope encoded by CD44 exon V6 (Wirth et al., 1993). A full-length splice variant was detected in the rat apical ectodermal ridge (AER) on day 12.5 postcoitum. The AER promotes the proliferation of the underlying mesenchymal cells in the growing limb. The mesenchyme, however, expresses CD44s only. After treatment with mAb directed against an epitope encoded by the CD44 V6 exon, the AER failed to support the outgrowth of the limb bud (Wainwright et al., 1996). Since a member of the fibroblast growth factor (FGF) family is able to replace the AER function (Niswander et al., 1993), and as the proteoglycan version of CD44 can present various growth factors (see Section IX.H), it is tempting to speculate that the role of AER CD44 variants is to present FGF-like growth factors to growing limbs (Wainwright et al., 1996).
Embryonic hematopoietic cells vary in CD44 expression according to differentiation stage. Pluripotent bone marrow stem cells express CD44 (reviewed by Lesley et al., 1993a), and CD44 is also detected on bone marrow cells about to populate the thymus (pre-T cells). The earliest mouse thymocyte population (CD4- CD8- CD3- interleukin-2 receptor- [IL-2R-]) expresses CD44s.
The differentiated progeny of these double-negative (CD4- CD8-) cells undergo a transition from CD44+ IL-2R- through CD44- IL-2R+ to CD44- IL-2R-, becoming double-positive (CD4+ CD8+) cells that express the T-cell receptor (Husmann et al., 1988; Penit and Vasseur, 1989; Lesley et al., 1990b; Petrie et al., 1990; Scollay, 1991). At a later stage, CD44 reappears on T cells of mouse strains expressing the CD44.1 allele (e.g., BALB/c mice). In contrast, only small amounts of CD44 are expressed in CD44.2 mouse strains (e.g., C57BL/6 and AKR/J) (Lynch and Ceredig, 1988, 1989). The same sequence of CD44 expression has been observed in the fetal human thymus. The earliest cells migrating into the thymus express CD44. During cell differentiation, CD44 is downregulated or lost (excluding a small subpopulation of thymic non-T cells) and later reexpressed (de los Toyos et al., 1989; Horst et al., 1990a; Márquez et al., 1995), mostly as CD44s (Mackay et al., 1994). CD44 isoform expression during fetal human thymus development has been analyzed by immunostaining of frozen tissues. CD44 was detected on both thymic epithelial cells and thymocytes beginning at 8.2 weeks of fetal gestation, the time of initial colonization by bone marrow stem cells. CD44
variants containing V4, V6, and V9 exon products emerge later, at 10 weeks of fetal gestation, and are confined to terminally differentiated thymic epithelial cells within and surrounding Hassall bodies (Mackay et al., 1994; Terpe et al., 1994b; Patel et al., 1995). Early human B cells (CD10+) express a low level of CD44. The transition to the CD44-high phenotype occurs relatively late in development, when the cells assume the CD20 phenotype (Kansas and Dailey, 1989). Bone marrow precursors (i.e., granulocyte-macrophage colony-forming units and erythroid burst-forming units) express high levels of CD44 (Lewinsohn et al., 1990). Whereas adult rat hematopoietic cells express mostly CD44s (see Section IV.C), a large fraction of newborn bone marrow and mesenteric lymph node cells display, as shown by flow cytometry, CD44 variants containing the V6 exon product (Arch et al., 1992). This indicates that CD44v expression is shared by both the hematopoietic and the nonhematopoietic embryonic cells.
B. CD44 Isoforms Expressed on Epithelial Cells and on Other Nonhematopoietic Cells of the Adult
Comprehensive knowledge regarding cells expressing standard CD44 or its variants at distinct states of differentiation or activation might allow better evaluation of CD44 function. Moreover, quantitative and qualitative comparisons of CD44 expressed on tumors and on the corresponding normal cells may reveal variations that could be exploited for diagnostic and prognostic purposes (see Section XII). Using different techniques, such as mRNA analysis (including RT-PCR) (Stamenkovic et al., 1989; Quackenbush et al., 1990), IHC (Alho and Underhill, 1989), and flow cytometry (Mackay et al., 1994), CD44 was detected on multiple cell types. Earlier and even later studies, using antibodies specific for the CD44 constant region, did not distinguish between the different isoforms. Such CD44 molecules (also designated pan-CD44) have been detected on the epithelium of skin, cheek, tongue, esophagus, vagina, cervix, ovary, intestine, stomach, oviduct, bladder, the tubular region of the kidney, liver bile ducts, lung bronchi, salivary gland, thyroid gland, mammary gland, endometrium, epididymis, prostate gland, pancreatic ducts, urinary tract, and Hassall corpuscles in the thymus (Alho and Underhill, 1989; Stamenkovic et al., 1989; Heider et al., 1993b; Behzad et al., 1994; Fox et al., 1994; Mackay et al., 1994; Terpe et al., 1994b; Yaegashi et al., 1995). The molecule is predominantly expressed in regions of active cell growth (Alho and Underhill, 1989; Mackay et al., 1994). The use of variant-specific mAbs and the relevant PCR primers has enabled more rigorous definition of the CD44 isoforms present in the previously
indicated epithelial tissues. These include, in addition to CD44s, CD44 isoforms expressing different combinations of variant exons such as CD44V3-10 or CD44V8-10, the most representative epithelial isoform (Fig. 1A) (Behzad et al., 1994; Iida and Bourguignon, 1995; Stauder et al., 1995). A CD44 variant containing V6 exon products (CD44V6) has been detected in the basal layer of epidermis, on hair follicles, and on cryptic gut epithelium of adult and newborn rats (Wirth et al., 1993). In humans, both CD44V6 and CD44V4 were identified in epithelial cells of skin epidermis (CD44V6 was evident particularly in the upper layers; Salmi et al., 1993), hair follicles, esophagus, tonsil, and thymic Hassall’s corpuscles. CD44V6 (but not CD44V4) was detected in the epithelium of sweat glands, prostate gland, mammary gland, and lung bronchi. CD44V9 has been found in all of the previously mentioned epithelial tissues as well as in intestine, stomach, pancreatic ducts, the tubular region of the kidney, hepatic bile ducts, thyroid gland, salivary gland, endometrium, epididymis, urinary tract, and epithelial cells of mucosa-associated lymphoid organs (Mackay et al., 1994; Terpe et al., 1994b; Stauder et al., 1995). In another study, IHC staining of normal tissues revealed the presence of CD44V3, V4,5, V6, and V8,9 variants in respiratory epithelium, transitional epithelium, and keratinizing and nonkeratinizing squamous epithelium. Expression of CD44V3, V4,5, and V6 was detected in placental cytotrophoblasts, thyroid follicular epithelium, skin adnexae, the myoepithelial layer of breast and prostate, and thymic Hassall’s corpuscles (Fox et al., 1994). In two studies the crypt epithelium of the gastrointestinal tract and the pancreatic ducts proved to be negative or only weakly positive for CD44 containing V4- or V6-encoded epitopes (Fox et al., 1994; Mackay et al., 1994).
However, contradictory findings regarding the presence of CD44 variants in other epithelial tissues have been obtained. For example, thyroid and salivary glands were CD44V4- and CD44V6- in one study (Mackay et al., 1994) but positive in another (Fox et al., 1994). This discrepancy can be attributed to technical differences in methodology, varying degrees of staining sensitivity, or the use of antibodies that recognize different epitopes. The presence of CD44 on endothelium is also a matter of dispute, possibly related to similar causes. Some investigators (Alho and Underhill, 1989; Quackenbush et al., 1990) have reported that endothelial cells do not express CD44, whereas others (Pals et al., 1989a; Fox et al., 1994) demonstrated just the opposite. Furthermore, it has been shown by one group (Mackay et al., 1993), but not by another (Bennett et al., 1995a), that in endothelial cells tumor necrosis factor α (TNFα) upregulates CD44. CD44 has also been detected on smooth muscle (Pals et al., 1989a; Picker et al., 1989), fibroblasts (Flanagan et al., 1989; Pals et al., 1989a; Picker et al., 1989), melanocytes (Guo et al., 1994c), adrenal gland, and the choroid of the eye (Kennel et al., 1993). In addition, CD44 has been found in the
white matter, especially on perivascular astrocytes (Picker et al., 1989; Girgrah et al., 1991b; Asher and Bignami, 1992; Vogel et al., 1992; Moretto et al., 1993; Salmi et al., 1993), and on Schwann cells of the peripheral nervous system in rats (Sherman et al., 1995). Upregulation of CD44V6 and downregulation of CD44s and CD44V9 were observed after incubation of human epithelial cell lines with interferon γ (IFNγ) (Mackay et al., 1994). Phorbol 12-myristate 13-acetate (PMA) or the combination of TNFα and IFNγ enhanced CD44 mRNA and protein expression in murine astrocytes. Furthermore, PMA-stimulated astrocytes also express CD44 variants containing V6 and V10 exons (Haegel et al., 1993). Enhanced expression of CD44 mRNA has also been detected in human lung fibroblasts incubated with IL-1α or TNF, and it is further augmented by a combination of the two (Sampson et al., 1992). The 29-kDa fragment of fibronectin or IL-1α upregulated the expression of CD44s and CD44V10 in chondrocytes (the latter to a lesser degree), while simultaneously inhibiting proteoglycan synthesis (Chow et al., 1995). Analysis by RT-PCR revealed that, in human MCF-7 breast carcinoma cells, CD44V8-10 (CD44E) and CD44V8,9 isoforms were upregulated after treatment with hyaluronidase (Tanabe et al., 1993).
C. CD44 Isoforms Expressed on Nonactivated and Activated Hematopoietic Cells
All types of hematopoietic cells, including erythrocytes, T and B lymphocytes, natural killer cells, macrophages, alveolar macrophages, Kupffer cells, and interdigitating and follicular dendritic cells, as well as granulocytes, preferentially express CD44s (Pals et al., 1989a; Quackenbush et al., 1990; Koopman et al., 1993; Fox et al., 1994; Mackay et al., 1994; Arai et al., 1995; Telen, 1995). CD44V4,5 and V6 have been detected on medullary thymocytes (Fox et al., 1994), and CD44V3, CD44V6, and CD44V10 are present on lymphocytes of reactive lymph nodes (Stauder et al., 1995). These findings lead to the conclusion that the CD44 repertoire of hematopoietic cells is far more restricted than that of epithelial cells. Memory and activated T cells have much higher levels of CD44 than do naive T cells; in addition, they express CD44 variants containing the V9 exon (Mackay et al., 1994; reviewed in Lesley et al., 1993a). Memory or activated CD44+ T cells also express high levels of LFA-1, CD2, and CD58 (LFA-3, the ligand of CD2) and low levels of CD45RA (a marker of naive T cells) (Sanders et al., 1988; Mackay et al., 1990). In addition, they produce high levels of IFNγ (Budd et al., 1987). Sheep memory cells, expressing high levels of CD44 but lacking MEL-14, enter the peripheral lymph node exclusively via the afferent lymphatics (Mackay et al., 1990). The CD44+ MEL-14- mouse T-cell lymphoma utilizes the same route for lymph node invasion (Zahalka et al., 1995).
A substantial change in the CD44 repertoire has been noted after cell activation. Three to 14 days after in vivo antigenic stimulation with allogeneic cells, rat T cells, B cells, and macrophages express, in addition to CD44s, CD44 variants containing the V6 exon product, as indicated by RT-PCR or relevant anti-CD44 variant-specific mAb (Arch et al., 1992). Similarly, in vitro mitogenic, allogeneic (mixed lymphocyte reaction), or antigenic activation of human T cells transiently upregulates cell surface expression of CD44 variants containing V6 and V9 exon products, as shown by V6- and V9-specific mAbs (Koopman et al., 1993; Mackay et al., 1994). A more recent RT-PCR study (Stauder et al., 1995) implies that nonstimulated cloned T cells express CD44V3, CD44V6, and CD44V10 isoforms. This repertoire is changed after phytohemagglutinin stimulation to include larger variants (CD44V3-8, CD44V24, and CD44V8,9). It should be noted, however, that long-term culture may induce the production of CD44 isoforms that are not present in the primary cells (see introduction to Section XII). Injection of mAb directed against the V6-encoded epitope into mice simultaneously immunized with allogeneic cells or hapten-carrier conjugate inhibited their cytotoxic responses to the relevant allogeneic target cells or the humoral responses to both hapten and carrier (Arch et al., 1992). This finding suggests that cell surface expression of V6 in CD44 molecules is essential to the normal function of cells involved in the immune response. To ascertain the biological significance of lymphocytes expressing CD44 containing the V6 exon, Moll and his colleagues (1996) generated transgenic mouse strains whose T cells constitutively express the rat CD44V4-7 gene product.
Lymphocytes of transgenic mice immunized with allogeneic cells or trinitrophenylated bovine serum albumin (TNP-BSA) displayed, following in vitro stimulation with the relevant antigens, earlier and more extensive cell proliferation than did lymphocytes from nontransgenic mice. The cytotoxic response of the transgenic lymphocytes against relevant allogeneic blast cells was also markedly accelerated and increased. Similarly, the primary in vitro proliferative responses of transgenic lymphocytes to allogeneic cells and TNP-BSA were higher and more rapid than those of nontransgenic lymphocytes. In the presence of mAb directed against the V6-encoded epitope of rat CD44v, the enhanced antigen-induced proliferative responses of the transgenic lymphocytes reverted to the level of the nontransgenic lymphocytes. These findings suggest that CD44V4-7 drives the lymphocytes to a constitutive preactivation state (analogous to that in memory T cells), resulting in their earlier entry into the S phase (Moll et al., 1996). It is shown in Section XI.A that tumor cells expressing CD44V4-7 have a growth advantage.
CD44 transition has also been demonstrated during B-cell activation and differentiation in the germinal centers of human tonsil. High levels of CD44 are expressed on resting immunoglobulin (Ig) D+ IgM+ cells. Following antigen activation, CD44 is downregulated at the early blast stage, when the blast marker CD38 emerges. During the blast (CD38+ IgM+) and centroblast (CD38+ Ig-) stages, CD44 expression remains low or negative, but it is again upregulated upon transition to the centrocyte level. IgG+ and IgA+ cells at the postgerminal center stage express high levels of CD44 (Kremmidiotis and Zola, 1995). Similarly, mouse splenic B cells express CD44, which undergoes upregulation after stimulation with lipopolysaccharide (LPS), anti-IgD-dextran, the supernatant of cloned Th2 cells, or IL-5 (Murakami et al., 1990; Hathcock et al., 1993). Normal human splenic B cells activated with anti-Ig antibody express increased levels of CD44s, CD44V6, and a CD44 variant containing the V10 exon product (Salles et al., 1993). Whereas resting human peripheral blood B cells express CD44s only, various CD44 variants (CD44E, CD44V10, CD44V6, and CD44V6,7) have been detected after stimulation with PMA, anti-IgM mAb, or IL-2. Epstein-Barr virus (EBV)-negative Burkitt lymphoma (BL) cells do not express CD44. In contrast, CD44s, CD44E, and CD44V10 were detected in EBV+ BL cells and EBV- BL cells infected with EBV (Kryworuckho et al., 1995), indicating that fewer variants are induced after viral activation. Bone marrow human plasma cells constitutively express CD44V9. Pulmonary macrophages express both CD44V6 and CD44V9. CD44 variants expressing V6 and V9 exons are upregulated in myelomonocytic cell lines after in vitro incubation with TNFα and IFNγ, as indicated by flow cytometry and RT-PCR analysis (Mackay et al., 1994).
It is difficult to ignore the finding that epithelial regions rich in proliferating cells, such as the basal cells of stratified squamous epithelium and glandular epithelium, express high levels of CD44v, especially isoforms containing the V6 exon. Similarly, activated leukocytes and epithelial cells upregulate V6- and V9-containing variants.
The extensive locomotive and generative activities within the embryo are also accompanied by marked expression of CD44v. Again, the V6 version is particularly conspicuous. Malignant cells, which share many properties with normal adult and fetal cells of generative tissues, bear similar CD44 isoforms (see Section XII).
V. HYALURONIC ACID IS THE PRINCIPAL LIGAND OF CD44
A. The Structure of Hyaluronic Acid and Its Distribution in Normal and Tumor-Invaded Tissues
Hyaluronic acid (HA; hyaluronate, hyaluronan) is a ubiquitous glycosaminoglycan (GAG) consisting of a linear polymer of repeating disaccharide units with the structure (D-glucuronic acid [1-β-3] N-acetyl-D-glucosamine [1-β-4])n. Hyaluronate has a high molecular mass (10^6-10^7 Da) (Laurent and Fraser, 1992). It is synthesized by fibroblasts (Teder et al., 1995), chondrocytes (Mason et al., 1989), and mesothelial cells (Honda et al., 1991; Heldin et al., 1992). The production of HA by human lung fibroblasts is stimulated by the cytokines TNF, IFNγ, and IL-1 and further augmented by a combination of TNF and IFNγ, or of TNF and IL-1, as was shown in an in vitro assay (Sampson et al., 1992). Hyaluronate is an important component of the ECM, filling the intercellular spaces. Within the ECM, HA noncovalently interacts with proteoglycans, the binding being stabilized by a link protein (Hardingham and Fosang, 1992; Laurent and Fraser, 1992). Hyaluronan is particularly abundant in connective tissues such as skin dermis, smooth muscle, lung, the lamina propria of mucous membranes, and the adventitia surrounding blood vessels (Aruffo et al., 1990). It is also present in the lymph (Laurent and Fraser, 1992) and lymph node matrix (Aruffo et al., 1990). Hyaluronic acid provides cellular support and a water-filled compartment. In addition, HA regulates cell-cell adhesion and the cell’s spatial orientation and traffic, as well as its growth and differentiation (Laurent and Fraser, 1992). Consequently, HA is involved in various biological processes such as inflammation (reviewed in Laurent and Fraser, 1992), wound healing, and tissue remodeling (West et al., 1985; Weigel et al., 1989; Laurent and Fraser, 1992; Oksala et al., 1995), as well as morphogenesis (Laurent and Fraser, 1992). Hyaluronan also interacts with the cell surface to form a “coat,” which may act as a protective cellular barrier (McBride and Bard, 1979; Gately et al., 1984). In addition, it supports the migration of invasive tumors (Toole et al., 1979).
Indeed, tumor invasion is sometimes observed in regions with high concentrations of HA, such as the medullary and papillary interstitium of renal tissue, the submucosa of the gastrointestinal tract, around the centrilobular veins, and beneath the capsule of the liver (Sy et al., 1991). Some tumors synthesize and release HA into their immediate environment (Turley and Tretiak, 1985), while others stimulate the production of HA by surrounding fibroblasts (Knudson et al., 1984). Zhang and colleagues (1995) reported that intravenously injected mouse melanoma cells (B16-F1) expressing a high level of HA on their surfaces formed a greater number of lung metastases than did melanoma cells bearing low amounts of surface HA. Collectively, these findings point to a role for HA in metastasis.
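To give a sense of scale for the molecular mass range quoted above, the following minimal sketch converts a chain mass into an approximate number of repeating disaccharide units. The per-unit mass of about 379 Da is an assumption introduced here for illustration (glucuronic acid plus N-acetylglucosamine, less the water lost on polymerization); it is not a value taken from the cited studies, and the function name is hypothetical.

```python
# Rough chain-length estimate for hyaluronic acid (HA).
# ASSUMPTION (not from the review): one repeating disaccharide unit
# (D-glucuronic acid + N-acetyl-D-glucosamine, minus the water released
# on polymerization) weighs about 379 Da.
DISACCHARIDE_MASS_DA = 379.0


def disaccharide_units(polymer_mass_da: float) -> int:
    """Return the approximate number of disaccharide repeats in one HA chain."""
    if polymer_mass_da <= 0:
        raise ValueError("polymer mass must be positive")
    return round(polymer_mass_da / DISACCHARIDE_MASS_DA)


if __name__ == "__main__":
    # Lower and upper ends of the mass range quoted in the text.
    for mass in (1e6, 1e7):
        print(f"{mass:.0e} Da ~ {disaccharide_units(mass)} disaccharide units")
```

Under this assumption a single HA chain spans on the order of thousands to tens of thousands of disaccharide units, far more than the three-disaccharide minimum that a single binding site is reported to engage.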
B. Evidence That CD44 Is an HA Receptor
CD44 is the principal HA receptor, although the molecule can bind other ligands, in some cases at a lower affinity (see Section VI). The CD44 receptor coordinates a minimum of six HA sugar residues (three repeating disaccharide units), but has a higher affinity for longer HA molecules (Underhill
et al., 1983), with a dissociation constant of 0.3 nM (Bourguignon et al., 1993). A hint that CD44 is a lectin-like receptor for HA came from a sequence comparison between known HA binding proteins (cartilage link protein and proteoglycan core protein) and CD44. The amino-terminal domain of CD44 (amino acid positions 12-101 of the mature protein in the human) displays about 30% sequence homology with the HA binding region of the previously mentioned proteins (Fig. 1A) (Goldstein et al., 1989; Stamenkovic et al., 1989), with the homology increasing to about 50% if conservative amino acid substitutions are considered (Goldstein et al., 1989). The evidence for a receptor-ligand relationship between CD44 and HA has been established by several experimental approaches. It has been demonstrated that the binding of plastic-immobilized or soluble fluoresceinated HA (Fl-HA) to CD44-expressing cells can be prevented by some (but not all) anti-CD44 mAbs (Lesley et al., 1990a, 1992; Miyake et al., 1990b; Bennett et al., 1995b; Puré et al., 1995; Zahalka et al., 1995), an excess of soluble HA (Lesley et al., 1990a, 1992; Puré et al., 1995), or pretreatment of immobilized HA with hyaluronidase (Puré et al., 1995; Zahalka et al., 1995). After genetic fusion of cDNA encoding the extracellular domain of CD44 with genomic DNA segments encoding the IgG constant region, the construct was transfected into COS cells. The transfectants yielded a secretory, soluble CD44-immunoglobulin G (CD44-Ig) fusion protein that bound to lymph node high endothelial cells. Binding was blocked by the inclusion of low concentrations of HA, but not of other GAGs, in the assay system or by pretreatment of the endothelial cells with hyaluronidase (Aruffo et al., 1990). This finding implies that the HA expressed on high endothelial cells is recognized by soluble CD44.
CD44- cells transfected with CD44 cDNA acquire the ability to interact with anti-CD44 mAb (Aruffo et al., 1990; Stamenkovic et al., 1991; Lesley et al., 1992) or to bind to lymph node high endothelial cells. Binding to endothelial cells can be inhibited by anti-CD44 mAb, soluble HA, or pretreatment with hyaluronidase (Stamenkovic et al., 1991; Lesley et al., 1992). In addition, transfection of CD44 cDNA into CD44- cells conferred on them the ability to bind Fl-HA from solution or to adhere to plastic-immobilized HA. Again, binding was CD44 dependent (Lesley et al., 1992). Perhaps the clinching experiment proving that HA is a ligand of CD44 was the one showing that radiolabeled CD44 purified from placenta is able to bind immobilized HA, and that adherence is inhibited by anti-CD44 mAb, soluble HA, or hyaluronidase (St. Jacques et al., 1993). Similarly, soluble CD44-Ig fusion protein bound HA in a cell-free system (Peach et al., 1993). Not all the tested anti-CD44 mAbs can block the binding of HA to cells expressing CD44 (Lesley et al., 1990a; Zheng et al., 1995). This indicates that the topography of the CD44 epitopes and their orientation toward the HA binding site dictate the ability of the relevant antibody to interfere with
HA adherence. Perhaps even more important, not all CD44-expressing cells are able to bind HA, although some of them acquire this property after activation or chemical modification (see Sections V.F and V.G). A similar type of activation-dependent upregulation of ligand binding has been observed in integrins (O’Toole et al., 1990; O’Toole, 1995).
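The dissociation constant quoted in this section (0.3 nM) can be related to receptor occupancy with a short calculation. The sketch below assumes a simple one-site equilibrium binding model, which is an illustrative simplification introduced here and not a claim made by the cited studies; HA is a multivalent polymer and real binding is likely more complex. The function name and the ligand concentrations are hypothetical.

```python
# Fractional occupancy of a receptor at equilibrium.
# ASSUMPTION: simple one-site binding, occupancy = [L] / ([L] + Kd).
# This is an illustrative model only; multivalent HA binding to CD44
# need not follow it.
KD_NM = 0.3  # dissociation constant quoted in the text, in nM


def fraction_bound(ligand_nm: float, kd_nm: float = KD_NM) -> float:
    """Return the fraction of receptor sites occupied at a free ligand concentration."""
    if ligand_nm < 0:
        raise ValueError("ligand concentration cannot be negative")
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    return ligand_nm / (ligand_nm + kd_nm)


if __name__ == "__main__":
    # Hypothetical free-ligand concentrations, in nM.
    for conc in (0.03, 0.3, 3.0):
        print(f"[L] = {conc} nM -> occupancy {fraction_bound(conc):.2f}")
```

In this model, half of the sites are occupied when the free ligand concentration equals the dissociation constant, which is the usual operational meaning of the value.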
C. Binding of HA to CD44 Variants
The ability of human hematopoietic CD44 (CD44H or CD44s) to bind HA either constitutively or after activation is well established. Such consensus cannot, however, be extended to epithelial CD44 (CD44E, CD44V8-10) and other CD44 variants. A CD44- BL cell line (Namalwa) transfected with human CD44E, CD44V6-10, or CD44V7-10 cDNAs did not display considerable HA-dependent adhesion to lymph node high endothelial cells (Stamenkovic et al., 1991; Bartolazzi et al., 1995). Similarly, the Namalwa transfectants, as well as a melanoma cell line transfected with human CD44E cDNA, did not significantly bind to plastic-immobilized or soluble HA (Sy et al., 1991; Thomas et al., 1992; Bennett et al., 1995b; van der Voort et al., 1995). Insertion of CD44V3-10 or CD44V3,8-10 also failed to confer on B-lymphoma cells the ability to efficiently bind HA (Bartolazzi et al., 1995; Jackson, D. G., et al., 1995; van der Voort et al., 1995). In contrast, when transfected with CD44H cDNA, the same cell lines did bind HA. In addition, CD44E-Ig, in contrast to the CD44H-Ig fusion protein, does not interact with immobilized HA, as indicated by enzyme-linked immunosorbent assay (ELISA) (Peach et al., 1993; Bennett et al., 1995b). Phorbol ester treatment induced cells transfected with CD44H, but not with CD44E, cDNA to bind HA (Liao et al., 1993; van der Voort et al., 1995). Contradictory results were obtained, however, when a CD44- AKR1 cell line of mouse T-cell lymphoma was transfected with the mouse analog of human CD44E cDNA. The mouse CD44V8-10 transfectants, as well as mouse transfectants expressing other CD44 variants, exhibited significant binding to an HA-bearing cell line (He et al., 1992). In addition, a rat pancreatic adenocarcinoma cell line transfected with CD44V4-7 (pMeta-1) (see Fig. 1) displayed enhanced binding of soluble HA (Sleeman et al., 1996).
Lymphocytes of pMeta-1-transgenic mice expressing the CD44V4-7 transgene were, however, unable to bind HA (Sherman et al., 1994). Collectively, these results suggest that the insertion of additional exon products does not necessarily interfere with HA binding. Furthermore, it was found that, in activated human T cells, antibody-induced modulation of V6 or V9 CD44-encoded epitopes markedly reduces their ability to adhere to immobilized HA (Galluzzo et al., 1995), suggesting, rather, that the expression of variant exons enhances the CD44-ligand interaction. These seemingly conflicting findings can be reconciled by assuming that the specific internal cell environment influences the ability of the CD44 variants to interact with HA. It is likely that different cells synthesize analogous CD44 variants (e.g., CD44E) decorated with distinct glycosylation or glycanation patterns, some of which interfere with HA binding. An experiment supporting this assumption is described in Section V.G. Alternatively, a “binder” CD44 variant may differ from its “nonbinder” analog by the replacement of a few amino acids. Indeed, Dougherty and colleagues (1994) reported that a CD44 variant that differs from human CD44E by three amino acid substitutions was able to bind HA.
D. The Topography of the CD44 HA Binding Sites
CD44 belongs to a group of proteins (also known as hyaladherins; Knudson and Knudson, 1993) that share the ability to interact with HA. Included in this group are the receptor for hyaluronate-mediated motility (RHAMM; Turley et al., 1991); cartilage link protein; proteoglycan core protein (aggrecan) (Hardingham and Fosang, 1992); fibroblast versican (LeBaron et al., 1992); hyaluronectin (Delpech and Halavent, 1981); the IVd4 epitope of endothelial cells (Banerjee and Toole, 1992); TSG-6, a TNFα-inducible protein (Lee et al., 1992); and ICAM-1 (the latter was purified from an HA affinity column but the extracellular domain was not analyzed for HA binding) (McCourt et al., 1994). The suggestion has been advanced that these proteins include regions containing positively charged amino acids (arginine and lysine), which interact with the negatively charged hexuronate groups of HA (Hardingham and Fosang, 1992). Each HA binding cluster consists of two basic amino acids, either arginine or lysine, separated by seven nonacidic amino acids (B[X7]B motif) (Yang et al., 1994). In addition, some of the HA binding proteins (i.e., the proteoglycan core and link proteins) contain conserved cysteine residues, which possibly form disulfide-bridged loops that are involved in HA binding (Goetinck et al., 1987). The interspecies homologous region of the CD44 extracellular domain (Fig. 1A) also contains six conserved cysteines (Goldstein et al., 1989; Nottenburg et al., 1989; Zhou et al., 1989) that may form a single globular domain (Goldstein et al., 1989) or a structure consisting of three loops (Zheng et al., 1995). At least part of the HA binding capacity of CD44 is confined to these loop(s) (Liao et al., 1993; Zheng et al., 1995).
In contrast, the HA binding capacity of the second subgroup of proteins (e.g., RHAMM) is reduction resistant (Hoare et al., 1993), suggesting that their ligand binding is not dependent on an S-S-bonded globular domain. The complete sequencing of the CD44 molecule has allowed the identification of two extracellular regions containing a relatively high number of positively charged amino acids (arginine and lysine) known to be important
for HA binding by other proteins. Using truncation and site-directed mutagenesis of the human CD44 sequence, Peach and colleagues (1993) found that these two regions are involved in HA binding. One of the clusters (amino acids 21-45 of the primary human CD44 sequence) is included within the link protein homologous region (Fig. 1A, circle), where arginine at position 41 predominantly affects HA binding. The second cluster (amino acids 144-167) is located outside the link protein homologous region (Fig. 1A, ellipse), where mainly the combination of lysine at position 158 and arginine at position 162 influences HA binding. Both clusters are situated within the interspecies homologous region (Fig. 1A, black track at the amino terminus), which is located in the amino-terminal two-thirds of the CD44 extracellular domain. Truncation mutagenesis of the membrane-proximal region (Fig. 1A, the light shaded track), which follows the conserved amino terminus of human (Peach et al., 1993) or mouse (He et al., 1992) CD44, does not influence HA binding. The membrane-proximal region is the least conserved among species. Using mAbs recognizing defined epitopes of murine CD44s in order to block HA binding to CD44s-expressing cells, Zheng and colleagues (1995) showed that antibodies whose specificity was dictated by a stretch of eight amino acids located at positions 46-53 (of the mature protein) interfere with HA binding. The same effect was achieved with antibodies whose specificity is influenced by histidine at position 83 or valine and threonine at positions 90 and 91, respectively. Although these sites are included in the cartilage link protein homology region, none of them is located in the HA binding cluster described by Peach and colleagues (1993). However, antibodies directed against one CD44 epitope may change the conformation of the molecule so that the HA binding site, which is normally exposed at a different location, is no longer accessible.
Another interesting point deriving from the work of Zheng and colleagues (1995) is that different cell types display distinct CD44 epitopes, which, after interacting with the relevant antibody, influence HA binding sites. In conclusion, it appears that the HA binding machinery of CD44 is dependent on two basic amino acid motifs, critically spaced disulfide bonds forming the globular domain(s) (Neame et al., 1986), expression of variant exons (see Section V.C), the conformation and integrity of the entire molecule, and the glycosylation and glycanation status of the protein (Section V.G).
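The B[X7]B motif described in this section (two basic residues, arginine or lysine, separated by seven nonacidic residues) lends itself to a simple sequence scan. The following is a minimal sketch only, assuming one-letter amino acid codes and treating "nonacidic" as "neither aspartate (D) nor glutamate (E)"; the function name `find_bx7b` and the example sequences are hypothetical illustrations and are not taken from CD44 or from any of the cited studies.

```python
import re
from typing import List, Tuple

# B(X7)B: a basic residue (R or K), seven residues that are not acidic
# (assumed here to mean neither D nor E), then another basic residue.
# The pattern is wrapped in a lookahead so overlapping motifs are all reported.
BX7B = re.compile(r"(?=([RK][^DE]{7}[RK]))")


def find_bx7b(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, 9-residue match) for each B(X7)B motif."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in BX7B.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the scan.
    for position, motif in find_bx7b("AAKAGGSLLARTT"):
        print(position, motif)
```

A match from such a scan indicates only that the spacing rule is satisfied; as the mutagenesis and antibody-blocking data above show, whether a given cluster actually contributes to HA binding has to be established experimentally.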
E. Influence of the CD44 Cytoplasmic Tail and Cytoskeleton-Related Proteins on HA Binding to CD44
The observation that the CD44 cytoplasmic tail interacts with cytoskeletal proteins (Section III.D) raised the question of cytoskeleton involvement in HA binding. In this connection, a stretch of 15 amino acids (residues
305-320 of the primary mouse protein) was assigned to the interaction between the CD44 cytoplasmic tail and ankyrin, in view of the finding that the corresponding synthetic amino acid sequence exhibited specific binding to this cytoskeleton-related protein (Lokeshwar et al., 1994). Transfection of Jurkat or COS cells with CD44 lacking the cytoplasmic sequence that contains the ankyrin binding segment resulted in 90% reduction of the cell's ability to bind soluble HA or to adhere to plastic-immobilized HA (Liao et al., 1993; Lokeshwar et al., 1994). Additionally, immunofluorescent staining showed that HA induces patching and capping of CD44 molecules on the surface of mouse T lymphoma cells. Double staining revealed ankyrin accumulation underneath the capped structure of the CD44 HA receptor, suggesting that ankyrin is involved in linking CD44 to the cytoskeleton contractile system. The notion that HA binding and HA-induced receptor capping are dictated by signals delivered from the cytoskeleton is further supported by the observation that both events are inhibited by the microfilament blocker cytochalasin or the calmodulin blocker W-7 (Bourguignon et al., 1993; Galluzzo et al., 1995). Cytoskeleton involvement is also important for the function of a number of other adhesion molecules (such as cadherins) (Geiger and Ayalon, 1992). As opposed to the previously mentioned possibility that interaction with ankyrin is essential to the HA binding function of CD44 (Lokeshwar et al., 1994), another study demonstrated that deletion of the ankyrin binding segment did not prevent HA binding, as indicated by cDNA cell transfection and a ligand binding assay (Perschl et al., 1995b). However, it should be noted that the truncated CD44 molecules were inserted into different types of cells (COS versus AKR1) and were therefore influenced by distinct intracellular environments.
Further support for the view that CD44-related HA binding may be cytoskeleton independent is provided by experiments with transfected AKR1 cells in which all the inserted CD44 was detected in the Triton X-100-soluble fraction, yet the cells bound HA (Uff et al., 1995). In addition, incubation with cytochalasin B did not affect the adherence of CD44+ Molt-4 cells to immobilized HA (Murakami et al., 1994). As for cytoskeletal involvement in CD44 binding function, there are also contradictory findings regarding the ability of cells transfected with tailless CD44 constructs to adhere to immobilized HA. Three reports imply that cells expressing tailless CD44 do not adhere, or adhere only marginally, to immobilized HA (Lokeshwar et al., 1994; Puré et al., 1995; Uff et al., 1995), whereas other studies maintain just the opposite (Lesley et al., 1992; Thomas et al., 1992). Interestingly, these seemingly conflicting observations even extend to transfectants of the same cell type (i.e., AKR-1 cells; compare Lesley et al., 1992, with Puré et al., 1995). On the other hand, it is clear that cells expressing tailless CD44 do not bind soluble HA, whereas under identical conditions those expressing the wild-type CD44 possess this ability (Lesley et al., 1992, 1993b; Liao et al., 1993; Lokeshwar et al., 1994; Uff et al.,
1995). It has been shown that, after treatment with enhancing (agonistic) anti-CD44 mAb (e.g., IRAWB14 mAb) or its multimeric (but not monomeric) Fab fragments (Lesley et al., 1993b), even tailless CD44 transfectants that cannot interact with soluble (Lesley et al., 1992) or immobilized (Puré et al., 1995) HA acquired the ability to bind the ligand. In addition, transfectants expressing the disulfide-bonded dimer of CD44 (constructed by genetic replacement of the CD44 transmembrane region with the corresponding region of the CD3-ζ chain) efficiently bound soluble HA, even in the absence of the cytoplasmic domain (Perschl et al., 1995b). It bears mention that the transmembrane region of the CD3-ζ chain contains a cysteine residue that forms an intermolecular disulfide bond, generating functional ζ-ζ homodimers or heterodimers (Weissman et al., 1988). Similarly, cytoskeleton-mediated dimerization of CD44 molecules, through disulfide bonding, may allow HA binding under normal conditions. Indeed, Jurkat cells transfected with a human CD44 construct, in which cysteine-286 (primary protein) of the transmembrane domain had been replaced with alanine, failed to bind HA after stimulation with anti-CD3 mAb, whereas those transfected with the wild-type construct displayed efficient binding under identical conditions (Liu and Sy, 1996). Collectively, these results imply that the CD44-cytoskeleton interaction is not an absolute requirement for HA binding, as this interaction can be bypassed by cross-linking the CD44 receptor (see later). In summary, outside-in signals provided by anti-CD44 antibody may allow clustering of tailless CD44 molecules on the cell membrane, thus conferring on them the ability to bind HA. The same distributional change may be induced in tailless CD44 by HA immobilized onto plastic or by artificial dimerization of the transmembrane domain and, in wild-type CD44, by inside-out signals delivered from the cytoskeleton.
In any event, a threshold level of cell surface CD44 expression is required before HA binding is observed. At concentrations above this level, the amount of bound HA rises as the level of CD44 increases. This correlative effect has been observed in some (Perschl et al., 1995b; Uff et al., 1995), but not in all (Hyman et al., 1991; Galandrini et al., 1994b), cell lines. The primary mouse CD44 sequence is constitutively phosphorylated on serines 325 and 327 of the cytoplasmic domain, even though 10 potential phosphorylation sites are available (Puré et al., 1995). Puré and colleagues demonstrated that transfected AKR-1 cells expressing mutated mouse CD44, with serine substitutions at positions 325 or 327 to prevent phosphorylation, were defective in HA binding and ligand-induced receptor modulation. The ability to bind HA was restored by the addition of IRAWB14 anti-CD44 mAb, suggesting that dephosphorylated CD44 is unable to form the clusters that are essential to the adhesion function. Uff and colleagues (1995) reported contradictory findings: AKR-1 cell transfectants expressing human CD44 mutated at positions 323 and 325 (corresponding to positions