
Oncogene (2002) 21, 419–426 © 2002 Nature Publishing Group All rights reserved 0950–9232/02 $25.00 www.nature.com/onc

Characterization of viral-cellular fusion transcripts in a large series of HPV16 and 18 positive anogenital lesions

Nicolas Wentzensen1,4, Ruediger Ridder2,4, Ruediger Klaes1,3, Svetlana Vinokurova1, Ulrike Schaefer1 and Magnus von Knebel Doeberitz*,1

1Division of Molecular Pathology, Department of Pathology, University of Heidelberg, Im Neuenheimer Feld 110, 69120 Heidelberg, Germany

Persistent high risk type human papillomavirus (HR-HPV) infections induce dysplasia or cancer of the anogenital tract, most notably of the uterine cervix. The viral genome usually persists and replicates as an episomal molecule in early dysplasia, whereas in advanced dysplasia or cervical cancer HPV genomes are frequently integrated into the chromosomal DNA of the host cell. Previous studies suggested that modification of critical cellular sequences by integration of HPV genomes might significantly contribute to the neoplastic transformation of anogenital epithelia (insertional mutagenesis). This prompted us to characterize the integration loci of high risk HPV genomes in a large set of genital lesions. We amplified E6/E7 oncogene transcripts derived from integrated HPV16 and HPV18 genomes and characterized in detail the co-transcribed cellular sequences of 64 primary genital lesions and five cervical cancer cell lines. Database analyses of the cellular parts of these fusion transcripts revealed 51 different integration loci, including 26 transcribed genes (14 known genes, 12 EST sequences with unknown gene function). Seventeen sequences showed similarity to repetitive elements, and 26 sequences did not show any database match other than genomic sequence. Chromosomal integration loci were distributed over almost all human chromosomes. Although we found HPV sequences integrated into cancer related genes and close to fragile sites, no preferential site or integration motif could be identified. These data demonstrate that target directed insertional mutagenesis might occur in a few HPV-induced anogenital lesions; however, it is the exception rather than the rule. Oncogene (2002) 21, 419–426.
DOI: 10.1038/sj/onc/1205104

Keywords: human papillomavirus (HPV); E6-E7 oncogenes; integration; insertional mutagenesis; fusion transcripts

*Correspondence: M von Knebel Doeberitz; E-mail: [email protected] and [email protected]
Current addresses: 2MTM Laboratories AG, 69120 Heidelberg, Germany; 3Institute of Human Genetics, University of Heidelberg, Germany
4These authors contributed equally to this paper
Received 30 August 2001; revised 9 October 2001; accepted 29 October 2001

Introduction

Anogenital cancers, in particular cervical cancers, are strongly associated with persistent infections by high risk human papillomaviruses (HR-HPVs), most notably of types 16, 18, 31, 33, 35 and 58 (Walboomers et al., 1999). These cancers usually evolve through a distinct series of preneoplastic lesions, referred to as cervical intraepithelial neoplasia grade 1 to 3 (CIN 1-3). HR-HPV types encode two genes, E6 and E7, with well characterized oncogenic activities (zur Hausen, 1999). The expression levels of these viral oncogenes rise consistently with increasing dysplasia, suggesting that enhanced expression levels of the viral gene products are linked to advanced epithelial dysplasia (Durst et al., 1992; Stoler et al., 1992). Moreover, various experimental models demonstrated that growth of anogenital cancer cells relies on continuous expression of the viral oncogenes (von Knebel Doeberitz et al., 1988). This clearly documents the essential role of the viral oncogenes E6 and E7 in induction, progression, and maintenance of the neoplastic phenotype of anogenital cancer cells.

The gene products encoded by the viral oncogenes interact with various cellular factors, most of which are involved in the regulation of the cell cycle and genomic homeostasis. The viral E6 protein induces the premature degradation of p53, thereby continuously consuming recruitable p53 in HR-HPV-infected cells (Scheffner et al., 1994), whereas the E7 protein interacts with the hypophosphorylated retinoblastoma gene product (pRB), which results in release of E2F-type transcription factors from their binding to pRB (Scheffner et al., 1994). Release of E2F-type transcription factors leads to subsequent activation of G1/S phase promoting genes and induces inappropriate cell cycle progression.
Inactivation of the p53 and pRB gene products by the viral oncoproteins E6 and E7 thus results in impaired cell cycle control, reduced apoptotic activity through reduced p53 activities, and consequently in genomic instability. More recent reports further demonstrated that E6 and E7 also cooperate to induce disturbances of the mitotic spindle apparatus, resulting in chromosomal missegregation and significant recombination (Duensing et al., 2000, 2001). Thus, over time, increasing chromosomal losses and gains, chromosomal translocations and inappropriate segregation of mitotic spindles accumulate in HR-HPV infected cells expressing the viral oncogenes E6-E7 (Lazo, 1999).

Table 1 Summary of all analysed transcripts

ID | HPV | Pathology | Size (nt) | Transcript (a) | Database comparison (b) | Map | Fragile site (c)
--- | --- | --- | --- | --- | --- | --- | ---
int37 | 16 | SiHa | 643 | A | repeat ALU | 13q21 | FRA13B C/BrdU
int39 | 16 | Caski | 65 | B (3728) | no match | no match |
int34 | 16 | CIN 2 | 248 | A | no match | 10q26 | FRA10F C/Aph
int60 | 16 | CIN 2 | 350 | A | repeat L1 | repetitive |
int23 | 16 | CIN 3 | 191 | A | no match | 4 |
int52 | 16 | CIN 3 | 1800 | B (3530) | no match | 2q31 | FRA2G C/Aph
int53 | 16 | CIN 3 | 370 | A | repeat MLTC1 | repetitive |
int54 | 16 | CIN 3 | 1450 | A | CEACAM5 HS220529 | 19q13 | FRA19A C/5-aza
int57 | 16 | CIN 3 | 600 | B (3510) | no match | 13q22 |
int67 | 16 | CIN 3 | 400 | A | repeat L1 | 13q21 | FRA13B C/BrdU
int68 | 16 | CIN 3 | 500 | A | PTPN13 HS211595 | 4q21 |
int69 | 16 | CIN 3 | 850 | C (810) | EST HS205126 | 1q32 |
int1 | 16 | Ca | 592 | A | EST HS57811 | 14q24.1 | FRA14C C/Aph
int2 | 16 | Ca | 211 | B (3547) | no match | 4p16 | FRA4A C/Aph
int3 | 16 | Ca | 1500 | A | repeat HERV gp70 | 2p24 | FRA2C C/Aph
int5 | 16 | Ca | 597 | A | FANCC HS37953 | 9q22 | FRA9D C/Aph
int6 | 16 | Ca | 256 | A | EST HS97790 | 20p11 | FRA20A R/Fol
int8 | 16 | Ca | 1300 | A | no match | 20p12 | FRA20B C/Aph
int9 | 16 | Ca | 271 | A | WASF2 HS288908 | 1p36 | FRA1A C/Aph
int10 | 16 | Ca | 193 | A | no match | 13q21.2 | FRA13B C/BrdU
int17 | 16 | Ca | 213 | A | repeat ALU | 5q35 | FRA5G R/Fol
int19 | 16 | Ca | 456 | A | 80% HS74115 | 10p15 |
int29 | 16 | Ca | 200 | C (1245) | NR4A2 HS82120 | 2q24 | FRA2K R/Fol
int30 | 16 | Ca | 337 | A | no match | 13q21 | FRA13B C/BrdU
int31 | 16 | Ca | 219 | A | repeat ALU | repetitive |
int33 | 16 | Ca | 245 | A | EST HS23106 KIAA0130 | 17q12 |
int36 | 16 | Ca | 319 | A | no match | 9q34 |
int41 | 16 | Ca | 402 | A | no match | 8q24 | FRA8C C/Aph
int42 | 16 | Ca | 263 | A | EST HS162183 | 8p23 |
int43 | 16 | Ca | 450 | A | repeat ALU | repetitive |
int44 | 16 | Ca | 653 | A | no match | 9p13 |
int45 | 16 | Ca | 219 | A | EST AA084805 | 2q33 | FRA2I C/Aph
int46 | 16 | Ca | 894 | A | KLHL3 HS7388 | 5q31 | FRA5C C/Aph
int56 | 16 | Ca | 600 | A | repeat ALU | repetitive |
int58 | 16 | Ca | 500 | A | no match | 13q21 | FRA13B C/BrdU
int61 | 16 | Ca | 520 | A | intron AFP | 16q12 | FRA16B R/Dist
int62 | 16 | Ca | 600 | A | EST HS195730 | 15q15 |
int63 | 16 | Ca | 500 | A | repeat mariner transposon | repetitive |
int70 | 16 | Ca | 475 | A | no match | 2q23 | FRA2K R/Fol
int71 | 16 | Ca | 519 | A | GLS HS239189 | 2q32 | FRA2H C/Aph
int72 | 16 | Ca | 1500 | A | EST A1555655 | 3q21 | FRA3F not classified
int73 | 16 | Ca | 487 | A | RPS27 HS195453 | 1q21 | FRA1F C/Aph
int74 | 16 | Ca | 1400 | A | no match | 7q21 | FRA7E C/Aph
int76 | 16 | Ca | 489 | A | no match | 6q21 | FRA6F C/Aph
int78 | 16 | Ca | 400 | A | FER1L3 HS43087 | 10q23 | FRA10A R/Fol
int79 | 16 | Ca | 900 | A | no match | 10q22 | FRA10D C/Aph
int20 | 16 | VaIN 2 | 260 | A | repeat L1 | 21q22 |
int18 | 16 | VIN 2 | 278 | A | no match | 4q31 | FRA4C C/Aph
int21 | 16 | VIN 3 | 307 | A | TCP1 HS4112 | 6q25 | FRA6E C/Aph
int32 | 16 | VIN 3 | 1500 | A | no match | 3q28 | FRA3C C/Aph
int50 | 16 | VIN 3 | 1450 | A | KPNA3 HS3886 | 13q14 |
int4 | 16 | VIN X | 360 | A | TP63 HS137569 | 3q28 | FRA3C C/Aph
int47 | 18 | HeLa | 570 | A | EST HS281235 | 8q24 | FRA8C C/Aph
int48 | 18 | C4-1 | 429 | A | repeat MER | 8q21 | FRA8B C/Aph
int49 | 18 | SW756 | 211 | A | 80% BF688062 | no match |
int11 | 18 | Ca | 489 | A | repeat ALU | repetitive |
int13 | 18 | Ca | 672 | A | EST HS12677, CGI-147 | 17q23 | FRA17B C/Aph
int15 | 18 | Ca | 600 | A | repeat ALU | 21q21 |
int16 | 18 | Ca | 411 | A | EST HS127775 | 8p11.2 |
int25 | 18 | Ca | 615 | A | MYC HS79070 | 8q24 | FRA8C C/Aph
int26 | 18 | Ca | 213 | A | no match | 3q28 | FRA3C C/Aph
int27 | 18 | Ca | 345 | A | repeat ALU | 12q21 | FRA12 C/Aph
int35 | 18 | Ca | 68 | A | no match | 4q21 |
int40 | 18 | Ca | 240 | C (1138) | repeat ALU | repetitive |
int55 | 18 | Ca | 180 | A | no match | 14q13 |
int64 | 18 | Ca | 400 | A | repeat HERV LTR | repetitive |
int65 | 18 | Ca | 550 | A | no match | 3p21 |
int66 | 18 | Ca | 620 | A | CHS1 HS36508 | 1q43 | FRA1I C/Aph
int14 | 18 | VaIN 3 | 202 | C (1011) | EST HS150715 | 15q12 |

(a) Number indicates last viral nucleotide before transition to cellular sequence. (b) HUGO gene name and Unigene cluster/EST accession number. (c) C=common, R=rare, Aph=aphidicolin, Fol=folate, BrdU=5-bromo-2'-deoxyuridine, 5-aza=5-azacytidine

Figure 1 Episome and integrate derived HR-HPV transcript types and their frequencies as revealed by the APOT assay. Type A shows E1 sequences spliced directly to cellular flanking sequence, type B shows E1 spliced to E4 sequences and then running into cellular flanking sequence, and type C transcripts are not spliced, but directly read through from viral to cellular sequences within the E1 gene

One very obvious consequence of increased genomic instability and recombinatorial activity in E6-E7 transformed cells is the integration of HPV genome fragments into host cell chromosomes (Cullen et al., 1991; Kessis et al., 1996). The structure of the integrated viral genome copies in advanced dysplastic lesions or invasive cervical cancers permits transcription of the transforming E6 and E7 genes, whereas other viral genes might be inactivated, deleted, or uncoupled from their respective transcriptional regulatory elements (Schwarz et al., 1985; Shirasawa et al., 1989). A recent study showed that in about 16% of CIN 3 lesions and in more than 80% of invasive cancer samples E6-E7 mRNAs were transcribed from integrated viral genome copies, whereas only 5% of CIN 2 lesions and less than 1% of CIN 1 lesions expressed integrate derived E6-E7 mRNA transcripts (Klaes et al., 1999). The predominant expression of integrated viral genomes in advanced dysplasia or cervical cancer cells suggests that integration by itself might favor neoplastic growth by conferring an additional selective advantage. This selective growth advantage might rely on various molecular mechanisms:

(i) Use of transcriptional control elements located in adjacent cellular regions close to the viral integration site could override transcriptional control of the viral oncogenes by viral control elements and therefore release transcription of the integrated E6-E7 genes from transcriptional inhibition mediated by the viral E2 gene product. This in turn might contribute to enhanced transcription of integrated viral genomes and thus result in increased levels of the growth promoting viral proteins (von Knebel Doeberitz et al., 1991).
(ii) The fusion of HPV E6-E7 encoding oncogene transcripts to transcribed cellular sequences may confer increased stability on the E6-E7 transcripts, likewise resulting in higher levels of the viral oncoproteins and thus enhanced oncogenic activity (Jeon and Lambert, 1995).

(iii) Conversely, viral sequences might also modify the structure and function of cellular genes and thereby trigger transformation. Integration of HPV DNA in genomic regions encoding tumor suppressive genes thus might lead to functional inactivation of these genes by insertional mutagenesis. This mechanism was shown in ME180 cervical cancer cells, in which the APM-1 gene is disrupted through integration of HPV68 sequences, resulting in loss of function of the APM-1 gene product (Reuter et al., 1998). In addition, several reports of papillomavirus integration close to cellular protooncogenes support the hypothesis that transcriptional activation of the respective cellular genes might further contribute to HPV associated carcinogenesis (Couturier et al., 1991; Durst et al., 1987).

These incidental observations thus suggested that targeting of specific genomic loci by integrated HPV genomes might contribute further important oncogenic functions to the pathogenesis of HR-HPV associated cancers. Most of the more detailed structural analyses of integrated viral genomes in cervical carcinoma cells or their precursors have been performed on a limited number of cervical cancer cell lines, including the well-characterized HPV18 positive cell lines HeLa, SW756, and C4-1, as well as the HPV16 positive cell lines Caski and SiHa (Couturier et al., 1991; el Awady et al., 1987; Mincheva et al., 1987; Popescu et al., 1987). Due to the limited material available from primary HPV associated cancers and their precursors and the complex technology required to characterize the integration sites in detail, only a few integration sites have been sufficiently investigated in primary lesions.
Here we used a recently described simplified amplification technique (APOT assay) (Klaes et al., 1999) to comprehensively determine the structure and sequences of HPV E6-E7 oncogene transcripts derived from integrated viral genomes in 64 biopsy samples. Genomic allocation and relation to known and expressed human sequences were identified through an extensive comparative database search.

Figure 2 Examples of fusion transcripts containing parts of known genes. The first line shows the fusion transcript in 5'→3' direction. The second line shows the respective genomic sequence including the integrated HPV genome. The question mark (?) indicates varying cellular sequences between splice donor (viral) and splice acceptor (cellular) sites. Red boxes indicate viral sequences. Green boxes indicate exons. Blue boxes indicate intronic sequences. The black line indicates an ORF in frame with the HPV E1 gene. (a) shows a transcript spliced to exon 7 of the CEACAM5 gene. There is a further splicing in the transcript to exon 8. (b) shows a transcript spliced to exon 3 of the WASF2 gene. There is further splicing in the transcript to exon 4. (c) shows a transcript spliced to exon 6 of the FANCC gene, continuing into intronic sequences. (d) shows a fusion transcript spliced to a region located 5' of exon 1 of the c-myc gene. (e) shows a transcript spliced to the 3' region of exon 8 of the NR4A2 gene. There is a four base pair overlap between HPV and cellular sequences at the fusion site. (f) shows a transcript spliced to the opposite strand of the TP63 gene

Results

Structure of viral-cellular fusion transcripts

Transcripts encoding the viral oncogenes were amplified and isolated as reported recently (Klaes et al., 1999) using a nested RT-PCR protocol with oligonucleotide primers located within the HPV E7 open reading frame (5' end) and an oligo(dT)-adaptor primer (3' end). In total, 69 isolated fusion transcripts were characterized in detail. Five transcripts were derived from the cervical cancer cell lines SiHa, Caski, HeLa, C4-1 and SW756, whereas the remaining 64 were amplified from clinical specimens, including 47 cervical carcinomas, eight CIN 3 lesions, two CIN 2 lesions, one VaIN 3 lesion, one VaIN 2 lesion, three VIN 3 lesions, one VIN 2 lesion, and one VIN X lesion. The respective sizes of the cellular flanking sequences ranged from 65 to 1800 bp with an average size of 450 bp. Fifty specimens were positive for HPV16, and 14 specimens were positive for HPV18 (Table 1).

Sequence analyses of the amplified cDNA fragments revealed three different types of viral-cellular fusion transcripts (Figure 1). In 61 out of 69 transcripts the splice donor site at the 5' end of the E1 ORF was used to splice the viral sequences into adjacent cellular flanking sequences (Type A, Figure 1). These transcripts harbor the E7 sequences, but lack E4 sequences, which are replaced by co-transcribed cellular sequences at their 3' ends. In four samples, fusion transcripts were spliced using the E4 splice acceptor site followed by cellular flanking sequences (Type B, Figure 1). In the remaining four cases, the transcripts were not spliced at the E1 splice donor site; instead, the transitions from HPV to cellular sequences were found at varying locations within the HPV E1 ORF (Type C, Figure 1), suggesting that in these cases no splicing event occurred within the viral E1 gene. A comparison of the transcribed cellular sequences enclosed in the viral-cellular fusion transcripts with the genomic sequence listed in the databases revealed that four of 14 transcripts deriving from known genes were spliced at least one more time in the flanking sequence. To perform a more detailed analysis of the splicing mechanism of Type A transcripts, the chromosomal region located 5' of the cellular flanking sequence was analysed for splice acceptor consensus sequences, showing a typical splice acceptor signal in all cases (data not shown).

Figure 3 Chromosomal distribution of HPV integration events. Numbers indicate individual fusion transcripts. Numbers with arrows were mapped to a chromosomal banding, whereas numbers without arrows were only mapped to the whole chromosome. Multiple numbers within one box represent multiple integration events at one chromosomal banding. 'F' within a yellow box represents fragile sites at the respective integration locus. Red: known genes. Green: EST sequences. Blue: repetitive sequences

Characterization of the cellular flanking sequences of viral-cellular fusion transcripts

Database comparison of the cellular flanking sequences showed identity to known genes in 14 cases and to expressed sequence tags (EST) in 12 cases. Two sequences showed partial similarity, but no identity, to known EST sequences. Seventeen transcripts were similar to repetitive sequence elements only. In 26 cases, no similarity to any expressed or repetitive sequence represented in current publicly available databases was found. The fusion transcripts isolated from the C4-1, HeLa and SW756 cervical cancer cell lines were identical to sequences described previously (Schneider-Gadicke and Schwarz, 1986). The identified chromosomal locus of the cellular flanking sequence from SiHa cells corresponds to the locus found by el Awady et al. (1987). In Caski cells, however, only a short cellular sequence could be amplified that did not show similarity to known sequences from the database (Baker et al., 1987).
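The grouping of database hits described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function name `categorize`, the category labels, and the pattern used to recognise a gene entry (an upper-case symbol followed by an `HS` Unigene number, as the entries appear in Table 1) are all hypothetical. Entries that fit none of the rules, such as `intron AFP`, are placed in an `other` bucket rather than guessed at.

```python
import re
from collections import Counter

# Assumption: a known-gene entry looks like "<SYMBOL> HS<digits>", as in Table 1.
GENE_PATTERN = re.compile(r"^[A-Z0-9-]+ HS\d+$")


def categorize(annotation):
    """Map a Table 1 'Database comparison' entry to a coarse category."""
    text = annotation.strip()
    if text == "no match":
        return "no match"
    if text.startswith("repeat "):
        return "repeat"
    if text.startswith("EST "):
        return "EST"
    if re.match(r"^\d+% ", text):
        # Entries such as "80% HS74115" denote partial similarity only.
        return "partial similarity"
    if GENE_PATTERN.match(text):
        return "known gene"
    return "other"


# A handful of entries copied from Table 1, for illustration only.
examples = [
    "repeat ALU",
    "no match",
    "CEACAM5 HS220529",
    "EST HS205126",
    "80% HS74115",
    "repeat L1",
    "MYC HS79070",
]

counts = Counter(categorize(entry) for entry in examples)
for category, number in sorted(counts.items()):
    print(f"{category}: {number}")
```

The EST rule is checked before the gene pattern on purpose, because an entry such as `EST HS205126` would otherwise also satisfy the gene pattern.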

The fusion transcripts encompassing parts of known genes showed different patterns: four fusion transcripts were spliced into the coding sequence of the respective gene in frame with the E1 start codon. Five fusion transcripts were spliced into the coding sequence of the respective gene, but not in frame with the HPV E1 gene. Two fusion transcripts were spliced into non-coding sequences within the 3' untranslated part of the respective genomic mRNA. Two fusion transcripts were spliced to the opposite strand in the coding region, and one fusion transcript was spliced into an intronic region close to the transcription initiation site of the respective gene. Figure 2 shows representative examples of fusion transcripts containing coding and non-coding regions of known genes. Figure 2a shows an integrate derived transcript that is spliced to exon 7 of the carcinoembryonic antigen (CEACAM5) gene in frame with the HPV E1 open reading frame. Figure 2b shows a transcript spliced to the coding sequence of WASF2, related to the Wiskott-Aldrich Syndrome Protein Family Member 1 (WASF1) gene, also in frame with HPV E1. Figure 2c shows a transcript spliced to exon 6 of the Fanconi Anemia Group C (FANCC) gene, not in frame with HPV E1. Figure 2d shows a fusion transcript spliced to the non-coding 5' region of the MYC oncogene. Figure 2e shows a fusion transcript spliced to the non-coding 3' region of the nuclear receptor of T cells gene (NR4A2). Figure 2f shows a transcript spliced to the opposite strand of the coding region of the TP63 gene. Other known genes affected by HPV integration in this study were TCP1, KLHL3, KPNA3, CHS1, PTPN13, GLS, RPS27 and FER1L3 (Table 1).

One transcript showed similarity to the env gene of an endogenous retroviral sequence; another transcript contained an endogenous retroviral LTR sequence. One transcript encompassed a mariner transposon-like sequence. The other repetitive elements found in this study were similar to ALU repeats in nine cases, to L1 repeats in three cases, and to mer4 and mltc1 repeats in one case each. Eight sequences that were not similar to known genes or EST sequences present in current databases showed a typical polyadenylation signal (AAUAAA) at their 3' end, suggesting that they might originate from transcribed chromosomal areas.

Chromosomal localization of HPV integration events

The individual chromosomal localizations of the cellular flanking sequences were identified by BLASTN comparisons to the whole genome database (Altschul et al., 1990) in 58 cases. Fifty-seven of these sequences could be clearly assigned to defined chromosomal bandings. All chromosomes except for 11, 18, 22, and X were found to be targeted by HPV integration in the analysed clinical specimens and cervical cancer cell lines (Figure 3). Since there were several reports on foreign DNA integration into distinct genomic areas called fragile sites, we screened the sequences adjacent to the integration sites for the presence of these fragile sites. Thirty-one of the 47 different chromosomal regions affected by HPV integration encompass known fragile sites, and distinct fragile sites were mapped to five of the six loci that contained more than one integrated HPV genome fragment. In total, 40 of 57 mapped integration events occurred in the area of fragile sites. Twenty-five of the 31 different fragile sites belong to the group of common fragile sites, whereas five correspond to rare fragile sites. One fragile site was not further classified. The majority (23 of 25) of the common fragile sites were aphidicolin-inducible, and four of the five rare fragile sites were sensitive to folate. These distributions correlate with the general representation of fragile sites in the whole genome (Sutherland and Richards, 1999).
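For readers who prefer percentages, the fragile-site counts reported above can be converted with a short Python sketch. The counts are taken directly from the text; the dictionary labels and the helper name `percent` are illustrative choices, and the last lines simply check that the reported subgroups (25 common, five rare, one unclassified) add up to the 31 distinct fragile sites.

```python
# Counts as (numerator, denominator), copied from the Results text.
fragile_site_counts = {
    "mapped integration events in fragile-site areas": (40, 57),
    "affected chromosomal regions encompassing fragile sites": (31, 47),
    "common fragile sites among distinct fragile sites": (25, 31),
    "aphidicolin-inducible among common fragile sites": (23, 25),
    "folate-sensitive among rare fragile sites": (4, 5),
}


def percent(numerator, denominator):
    """Return the share as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


summary = {
    label: percent(numerator, denominator)
    for label, (numerator, denominator) in fragile_site_counts.items()
}

for label, value in summary.items():
    print(f"{label}: {value}%")

# Consistency check on the classification of the distinct fragile sites.
common, rare, unclassified = 25, 5, 1
total_distinct_sites = common + rare + unclassified
print("distinct fragile sites:", total_distinct_sites)
```

Roughly 70% of the mapped integration events therefore lie in the area of a fragile site, which is the figure the Discussion relies on.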
Discussion

Persistent infections with oncogenic HPV types may result in the integration of viral genome copies into chromosomes of the respective host cells. This event is frequently associated with a selective growth advantage for cells expressing mRNA transcripts derived from the integrated viral genome copies and suggests that integration of the viral genome plays a crucial role in progression of preneoplastic to neoplastic lesions. The detailed characterization of distinct integration sites in single cervical cancer biopsies or cell lines derived thereof indicated that disruption of critical genes might contribute to the transformation process. However, whether this is the rule or rather a rare exception in HPV mediated carcinogenesis has not been analysed so far.

In previous studies, various techniques to isolate and characterize chromosomal integration sites have been used. In situ hybridization using HPV probes provides a rough estimate of the chromosomal localization, but does not allow fine mapping or the gathering of sequence information (Cannizzaro et al., 1988; Mincheva et al., 1987). PCR techniques were used to amplify genomic areas containing integrated HPV sequences, employing either ALU sequences (Carmody et al., 1996) or specific restriction sites (Thorland et al., 2000) as primer binding sites for the flanking cellular sequences. Most recently, chromosomal integration sites were characterized by genomic restriction enzyme digestion, adaptor ligation, and PCR amplification (Luft et al., 2001). We have used a previously described RT-PCR assay which allows the amplification of HPV oncogene transcripts derived from integrated viral genome copies (APOT assay) (Klaes et al., 1999). We amplified a total of 69 viral-cellular fusion transcripts, five obtained from cervical cancer cell lines and 64 from primary clinical lesions. The cellular parts of the fusion transcripts were compared to publicly accessible databases to identify those genes affected by HPV integration. Database comparison also revealed the chromosomal locus of the cellular flanking sequence in 58 cases. The amplification of the APOT fragments from the five cell lines allowed a good comparison with sequence data on fusion transcripts isolated in previous studies by other groups and demonstrated the general applicability of the approach chosen for this analysis.

Among all analysed samples and transcripts, we did not find the same gene affected by HPV integration in two independent lesions. We identified a broad spectrum of genes affected by HPV integration and having different physiological functions, comprising an interestingly high number of potentially tumor relevant genes, such as CEACAM5, FANCC, MYC, TP63 and PTPN13.
The CEACAM5 gene codes for an immunoreactive glycoprotein usually expressed in fetal tissue, but frequently found to be upregulated in colon cancer (Macdonald, 1999). The FANCC gene product is assumed to contribute to repair processes of damaged DNA (Chen et al., 1996). Expression of MYC leads to inhibition of differentiation and increased growth rate in cell lines (Amati et al., 2001). MYC overexpression was found in many human tumors, especially in small cell lung carcinomas, neuroblastomas, and retinoblastomas. TP63 shows strong sequence identity to the tumor suppressor gene TP53 and the related TP73 gene. TP63 was found to be overexpressed in head and neck cancer cell lines and in primary lung cancers (Hibi et al., 2000). PTPN13 is a Fas-associated phosphatase (FAP-1) that was shown to be an inhibitor of Fas-mediated apoptosis in human cancer cells (Li et al., 2000). In most of the cases, there were no continuous open reading frames from the HPV E1 gene into the cellular sequence. Since the E1 ORF is the third reading frame of the respective mRNA, it would generally permit translation of the respective fusion gene product, albeit with limited translation efficacy (Remm et al., 1999).


However, since open reading frames encoding potential fusion genes were found in only five of the 14 known genes, and other fusion transcripts did not show any longer open reading frames in the cellular flanking sequence, it is most unlikely that expression of viral-cellular fusion proteins regularly contributes to the HPV-mediated transformation process. Although we found an integration event in the regulatory region of the MYC gene, our data do not support a general mechanism of cellular transformation due to integration of transcriptionally active foreign DNA into transcriptionally silent chromosomal regions. Likewise, the complexity of integration sites observed in this study argues against a major and consistent contribution of insertional mutagenesis mediated disruption of critical genes in cervical carcinogenesis. Although our data do not formally rule out disruption of tumor suppressive genes through insertion of the viral genomes, its inconsistency and its obvious absence in the cases studied here strongly suggest that it may play only a minor role and may be restricted to a few selected cases. Nearly all chromosomes, with the exception of chromosomes 11, 18, 22 and X, were targeted by HPV integration in the samples analysed in this study. Six chromosomal bandings were targets for more than one integration event, all of them regions harboring tumor associated genes: five different HPV integrations occurred at 13q21 (BRCAX breast cancer susceptibility gene; Hopper, 2001); three different HPV integrations each occurred around 2q32 (LOC51655 ras related protein; Tu and Wu, 1999), at 3q27 (B-cell lymphoma associated translocation; Baron et al., 1993) and at 8q24 (MYC); and two different HPV integrations each were found at 4q21 (GRO 1–3 oncogenes; Haskill et al., 1990) and 9q33 (TNFSF8 tumor necrosis factor ligand superfamily; Smith et al., 1993). All these chromosomal regions except for 4q21 harbor known fragile sites.
Such fragile sites represent reproducibly expressed decondensations on mitotic chromosomes that are observed cytologically after treatment with specific substances. They are divided into rare (found in less than 5% of the population) and common (part of the normal chromosomal structure) sites based on the frequency with which they occur, and are further classified in terms of the chemical agent used to induce their expression (Sutherland and Richards, 1999). While specific nucleotide repeats were found in the area of rare fragile sites, no consensus sequences could be found for common fragile sites. These common sites seem to represent late replicating genomic areas (Sutherland and Richards, 1999). Genes in or near fragile sites are susceptible to chromosomal disruption and to foreign DNA integration. There was a high correlation between the individual locations of HPV integration sites and fragile sites in the genome, since 40 of 57 mapped HPV integration sites were found near fragile sites. These observations confirm data of other investigators, who have reported HPV integration sites near fragile sites in various cell lines and very few cervical carcinoma samples (Popescu et al., 1987; Smith et al., 1992; Thorland et al., 2000).

Although the exact mechanism of integration of HR-HPV genomes remains unclear and requires further studies on the molecular level, our data support the assumption that HPV integration during the pathogenesis of HPV infected anogenital cancers is a rather random process. The viral genomes appear to integrate non-selectively into chromosomal regions in which access for foreign DNA is facilitated, such as transcriptionally active areas with a loose chromatin structure or fragile sites with a susceptibility to DNA strand breaks. HR-HPV virions infect many cells, high amounts of viral DNA are subsequently replicated in epithelial cells exposed to the virus and to genotoxic agents, and the probability of DNA strand breaks and chromosomal rearrangements is fairly high. This suggests that recombination between viral and cellular sequences might occur more often than commonly anticipated. However, the selective advantage that leads to preferred outgrowth as dysplastic or neoplastic lesions is evidently conferred by the enhanced expression of the viral E6 and E7 oncogenes, which in turn is a consequence of the deregulated transcription of chromosomally integrated HR-HPV E6-E7 oncogenes (Jeon and Lambert, 1995; Klaes et al., 1999). Besides this highly consistent aspect of chromosomal integration, the random selection of the chromosomal integration site might indeed occasionally target critical genes, which could further support neoplastic progression. However, the data presented here clearly demonstrate that this is the exception rather than the rule.
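The search for chromosomal bands hit by more than one integration event, as discussed above, amounts to counting band labels. The following Python sketch illustrates this on a small subset of band labels copied from the Map column of Table 1; it is not the authors' analysis. The function names `band_key` and `recurrent_bands` are hypothetical, and grouping a sub-band such as `13q21.2` with `13q21` is an assumption made here for illustration.

```python
from collections import Counter


def band_key(band):
    """Reduce a band label to its main band.

    Assumption: sub-band suffixes such as ".2" are ignored when grouping.
    """
    return band.split(".")[0]


def recurrent_bands(bands, minimum=2):
    """Return bands that occur at least `minimum` times, with their counts."""
    counts = Counter(band_key(band) for band in bands)
    return {band: number for band, number in counts.items() if number >= minimum}


# Illustrative subset of Map entries from Table 1 (not the full data set).
subset = [
    "13q21", "10q26", "19q13", "13q21", "4q21", "13q21.2", "13q21",
    "8q24", "13q21", "3q28", "3q28", "8q24", "8q24", "3q28", "4q21",
]

recurrent = recurrent_bands(subset)
for band, number in sorted(recurrent.items()):
    print(f"{band}: {number} integration events")
```

Because the grouping depends on how finely each locus was mapped, counts obtained this way are sensitive to the choice of `band_key` and should be read as a rough tally.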


Materials and methods

Cervical carcinoma cell lines and cervical swab and biopsy samples

Human cervical carcinoma cell lines containing either HPV16 (SiHa, Caski) or HPV18 (HeLa, C4-1, SW756) were grown in Dulbecco's Modified Eagle's Medium (Invitrogen, Karlsruhe, Germany) supplemented with 10% fetal bovine serum and antibiotics under standard conditions. Clinical samples including biopsies and cervical swabs were collected, upon informed consent, from outpatients attending the University Departments of Obstetrics and Gynecology in Jena, Heidelberg and Ulm, Germany. Cervical swabs were collected with cytobrushes, and nucleic acids were extracted from these samples as described earlier (Klaes et al., 1999). DNA was used to determine the HPV type using PCR and RFLP as described elsewhere (Meyer et al., 1995).

APOT amplification of fusion transcripts

HPV oncogene transcripts were amplified as described before (Klaes et al., 1999). Briefly, reverse transcription was performed using an adaptor linked oligo(dT)-primer (Frohman, 1993), followed by a semi-nested PCR using HPV E7 specific 5' primers and oligo(dT) and adaptor primers (3'). PCR products were blotted onto nylon membranes and hybridized with HPV E7 and E4 specific probes to discriminate episomal from integration derived transcripts.


Cloning and sequence analysis of viral-cellular fusion transcripts

Integrate-derived amplimers were cloned into pCR 2.1 vector using the TA cloning kit (Invitrogen). Nucleotide sequences were determined by the dideoxy chain termination method using Cy5-fluorescent labelled primers (M13 forward and M13 reverse, Amersham Pharmacia Biotech, Uppsala, Sweden) according to standard protocols. Sequencing reactions were analysed using an automated laser fluorescence DNA analysis system (Amersham Pharmacia Biotech). The obtained sequences were analysed using the Heidelberg Unix Sequence Analysis Resources (HUSAR) and the BLASTN-program provided by the National Cancer Institute. Nucleotide sequences were compared to all available nucleotide sequence databases (EMBL, Genbank) and protein databases (Swiss-Prot).
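Before a database search, the cellular part of a fusion transcript has to be separated from the viral part. The Python sketch below shows one simple way to do this, assuming that the read starts within the viral sequence and matches a viral reference exactly up to the junction. The function name `split_fusion_transcript` and the toy sequences are hypothetical; this is not the HUSAR or BLASTN procedure used in the study, which relies on alignment rather than exact prefix matching.

```python
from typing import Tuple


def split_fusion_transcript(transcript: str, viral_reference: str) -> Tuple[str, str]:
    """Split a read into (viral part, cellular part).

    The viral part is the longest prefix of the read that matches the
    viral reference base for base. Everything after it is treated as
    co-transcribed cellular sequence. A real analysis would tolerate
    sequencing errors and use alignment; this is an illustration only.
    """
    read = transcript.upper()
    reference = viral_reference.upper()
    junction = 0
    for read_base, ref_base in zip(read, reference):
        if read_base != ref_base:
            break
        junction += 1
    return read[:junction], read[junction:]


if __name__ == "__main__":
    # Toy sequences, not real HPV or human sequence.
    viral_ref = "ATGCATGGAGATACACC"
    read = viral_ref + "GGTTCCAAGT"
    viral_part, cellular_part = split_fusion_transcript(read, viral_ref)
    print("viral:", viral_part)
    print("cellular:", cellular_part)
```

Only the cellular part returned by such a step would be compared against the nucleotide and protein databases. Note that if the first cellular bases happen to match the continuation of the viral reference, an exact-prefix rule places the junction a few bases too far downstream.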

References

Altschul SF, Gish W, Miller W, Myers EW and Lipman DJ. (1990). J. Mol. Biol., 215, 403–410.
Amati B, Frank SR, Donjerkovic D and Taubert S. (2001). Biochem. Biophys. Acta., 1471, M135–M145.
Baker CC, Phelps WC, Lindgren V, Braun MJ, Gonda MA and Howley PM. (1987). J. Virol., 61, 962–971.
Baron BW, Nucifora G, McCabe N, Espinosa III R, Le Beau MM and McKeithan TW. (1993). Proc. Natl. Acad. Sci. USA, 90, 5262–5266.
Cannizzaro LA, Durst M, Mendez MJ, Hecht BK and Hecht F. (1988). Cancer Genet. Cytogenet., 33, 93–98.
Carmody MW, Jones M, Tarraza H and Vary CP. (1996). Mol. Cell Probes, 10, 107–116.
Chen M, Tomkins DJ, Auerbach W, McKerlie C, Youssoufian H, Liu L, Gan O, Carreau M, Auerbach A, Groves T, Guidos CJ, Freedman MH, Cross J, Percy DH, Dick JE, Joyner AL and Buchwald M. (1996). Nat. Genet., 12, 448–451.
Couturier J, Sastre-Garau X, Schneider-Maunoury S, Labib A and Orth G. (1991). J. Virol., 65, 4534–4538.
Cullen AP, Reid R, Campion M and Lorincz AT. (1991). J. Virol., 65, 606–612.
Duensing S, Duensing A, Crum CP and Munger K. (2001). Cancer Res., 61, 2356–2360.
Duensing S, Lee LY, Duensing A, Basile J, Piboonniyom S, Gonzalez S, Crum CP and Munger K. (2000). Proc. Natl. Acad. Sci. USA, 97, 10002–10007.
Durst M, Croce CM, Gissmann L, Schwarz E and Huebner K. (1987). Proc. Natl. Acad. Sci. USA, 84, 1070–1074.
Durst M, Glitz D, Schneider A and zur Hausen H. (1992). Virology, 189, 132–140.
el Awady MK, Kaplan JB, O'Brien SJ and Burk RD. (1987). Virology, 159, 389–398.
Frohman MA. (1993). Meth. Enzymol., 218, 340–356.
Haskill S, Peace A, Morris J, Sporn SA, Anisowicz A, Lee SW, Smith T, Martin G, Ralph P and Sager R. (1990). Proc. Natl. Acad. Sci. USA, 87, 7732–7736.
Hibi K, Trink B, Patturajan M, Westra WH, Caballero OL, Hill DE, Ratovitski EA, Jen J and Sidransky D. (2000). Proc. Natl. Acad. Sci. USA, 97, 5462–5467.
Hopper JL. (2001). Breast Cancer Res., 3, 154–157.
Jeon S and Lambert PF. (1995). Proc. Natl. Acad. Sci. USA, 92, 1654–1658.
Kessis TD, Connolly DC, Hedrick L and Cho KR. (1996). Oncogene, 13, 427–431.
Klaes R, Woerner SM, Ridder R, Wentzensen N, Duerst M, Schneider A, Lotz B, Melsheimer P and von Knebel DM. (1999). Cancer Res., 59, 6132–6136.
Lazo PA. (1999). Br. J. Cancer, 80, 2008–2018.
Li Y, Kanki H, Hachiya T, Ohyama T, Irie S, Tang G, Mukai J and Sato T. (2000). Int. J. Cancer, 87, 473–479.
Luft F, Klaes R, Nees M, Durst M, Heilmann V, Melsheimer P and von Knebel Doeberitz M. (2001). Int. J. Cancer, 92, 9–17.
Macdonald JS. (1999). Semin. Oncol., 26, 556–560.
Meyer T, Arndt R, Stockfleth E, Flammann HT, Wolf H and Reischl U. (1995). Biotechniques, 19, 632–639.
Mincheva A, Gissmann L and zur Hausen H. (1987). Med. Microbiol. Immunol. (Berl)., 176, 245–256.
Popescu NC, DiPaolo JA and Amsbaugh SC. (1987). Cytogenet. Cell Genet., 44, 58–62.
Remm M, Remm A and Ustav M. (1999). J. Virol., 73, 3062–3070.
Reuter S, Bartelmann M, Vogt M, Geisen C, Napierski I, Kahn T, Delius H, Lichter P, Weitz S, Korn B and Schwarz E. (1998). EMBO J., 17, 215–222.
Scheffner M, Romanczuk H, Munger K, Huibregtse JM, Mietz JA and Howley PM. (1994). Curr. Top. Microbiol. Immunol., 186, 83–99.
Schneider-Gadicke A and Schwarz E. (1986). EMBO J., 5, 2285–2292.
Schwarz E, Freese UK, Gissmann L, Mayer W, Roggenbuck B, Stremlau A and zur Hausen H. (1985). Nature, 314, 111–114.
Shirasawa H, Tomita Y, Fuse A, Yamamoto T, Tanzawa H, Sekiya S, Takamizawa H and Simizu B. (1989). J. Gen. Virol., 70, 1913–1919.
Smith CA, Gruss HJ, Davis T, Anderson D, Farrah T, Baker E, Sutherland GR, Brannan CI, Copeland NG and Jenkins NA. (1993). Cell, 73, 1349–1360.
Smith PP, Friedman CL, Bryant EM and McDougall JK. (1992). Genes Chromosomes Cancer, 5, 150–157.
Stoler MH, Rhodes CR, Whitbeck A, Wolinsky SM, Chow LT and Broker TR. (1992). Hum. Pathol., 23, 117–128.
Sutherland GR and Richards RI. (1999). Am. J. Hum. Genet., 64, 354–359.
Thorland EC, Myers SL, Persing DH, Sarkar G, McGovern RM, Gostout BS and Smith DI. (2000). Cancer Res., 60, 5916–5921.
Tu Y and Wu C. (1999). Biochem. Biophys. Acta., 1489, 452–456.
von Knebel Doeberitz M, Oltersdorf T, Schwarz E and Gissmann L. (1988). Cancer Res., 48, 3780–3786.
von Knebel Doeberitz M, Bauknecht T, Bartsch D and zur Hausen H. (1991). Proc. Natl. Acad. Sci. USA, 88, 1411–1415.
Walboomers JM, Jacobs MV, Manos MM, Bosch FX, Kummer JA, Shah KV, Snijders PJ, Peto J, Meijer CJ and Munoz N. (1999). J. Pathol., 189, 12–19.
zur Hausen H. (1999). Semin. Cancer Biol., 9, 405–411.