Histopathology 2004, 44, 97–108
REVIEW
Applications of microarrays to histopathology
M van de Rijn & C B Gilks1
Department of Pathology, Stanford University Medical Center, Stanford, CA, USA, and 1Genetic Pathology Evaluation Center of the Department of Pathology and Prostate Centre, Vancouver General Hospital, British Columbia Cancer Agency and University of British Columbia, Vancouver, BC, Canada
van de Rijn M & Gilks C B (2004) Histopathology 44, 97–108
Applications of microarrays to histopathology
High-throughput microarray technologies have the potential to impact significantly on the practice of histopathology over the coming years. Global gene expression profiling allows for a systematic search of all human genes for novel diagnostic and prognostic markers and for potential therapeutic targets. Likewise, gene copy number changes can be determined on a gene-by-gene basis using microarrays. Tissue microarrays are an efficient method to extend and validate the findings obtained from the initial ‘discovery’ phase of the research, done using cDNA microarrays. In addition, tissue microarrays can be used for quality assurance for immunohistochemical and in situ hybridization procedures. In this review we give a brief overview of microarray technology and research uses, and discuss potential applications of microarrays in the practice of diagnostic histopathology.
Keywords: gene arrays, in situ hybridization, tissue microarrays
Address for correspondence: M van de Rijn MD, Department of Pathology, Stanford University Medical Center, 300 Pasteur Drive, Stanford, CA 94305, USA.
2004 Blackwell Publishing Limited.
Technology
Gene microarray technology rests on the ability to deposit many (tens of thousands) different DNA sequences on a small surface, usually a glass slide (often referred to as a ‘chip’). The different DNA fragments are arranged in rows and columns such that the identity of each fragment is known through its location on the array. A variety of methods exist to deposit these DNA sequences on the chip and these are discussed in detail elsewhere.1–3 Once synthesized, these chips can be used to measure the mRNA expression levels for tens of thousands of genes from a tissue sample by hybridizing fluorescently labelled cDNA from that tissue to the chip (Figure 1a). The ability to measure, in principle, the expression of all human genes in a single experiment is an enormous increase over the previously available techniques for measuring gene expression [Northern blot, reverse
transcriptase-polymerase chain reaction (RT-PCR)], where only a few genes could be studied per experiment. Such ‘global expression profiling’ not only looks at orders of magnitude more genes than was possible previously, but also has the advantage that the genes examined are not influenced by preselection of genes. An analogy would be looking for microorganisms by PCR compared with streaking a sample on a culture plate; the former will only identify organisms specifically sought with the primer sets used, while the second approach is restricted only by the organism’s ability to grow under the conditions chosen. The study of the expression of most, if not all, genes in a specimen is not hypothesis-driven as most research used to be,1 but is instead referred to as ‘discovery-type research’ or, in a less flattering description, as ‘fishing expeditions’. Whereas cDNA derived from a tumour is hybridized to a chip to study gene expression levels, alterations in DNA copy number (gene amplification or deletion) can be measured by hybridizing fluorescently labelled DNA from a tumour specimen to these chips.4,5 This technique is also applicable to DNA isolated from formalin-fixed paraffin-embedded material6 and
Figure 2. Gene microarrays and tissue microarrays (TMAs) are complementary in the features that can be examined. While gene microarrays are used to test the level of expression for many thousands of genes in relatively few samples, TMAs are used to test the staining pattern of relatively few markers on a large number of samples. In addition, TMAs can give much more information about the specific site of gene expression as they can distinguish, for example, between protein expressed in tumour cells versus that in surrounding connective tissue cells. Likewise they can distinguish predominantly nuclear localization of protein from membrane localization, an important consideration when searching for novel cell membrane-associated therapeutic targets. [Diagram summary. Gene arrays: one sample, many markers; gene expression; gene amplification/deletion. Tissue arrays: many samples, one marker; antibodies; in situ hybridization.]
has the advantage that it can be used to study specimens stored in surgical pathology archives. To date, it has not proven practical to extract usable mRNA reliably for gene expression profiling from archival tissue blocks. While gene arrays can generate huge amounts of data, the technology remains complex, the arrays are expensive, and, for gene expression studies, require fresh frozen material. Tissue microarrays (TMAs) are constructed by transferring cores of paraffin-embedded tissue to precored holes in a recipient paraffin block.7 Over 500 cores can be placed in a single block by this technique. Sections cut from TMA blocks can then be used for immunohistochemistry (IHC) or in situ hybridization studies (Figure 1b). TMAs are similar to gene expression microarrays in having samples arrayed in rows and columns on a glass slide; they differ in that each element on the TMA slide corresponds to a single patient sample, allowing multiple patient samples to be assessed for a single molecular marker in one experiment, while gene expression arrays allow assessment of thousands of molecular markers on a single patient sample per experiment (Figure 2).
Figure 1. a, Representative image of a 42 000 gene spot cDNA microarray hybridized with green fluorescently labelled reference cDNA and red fluorescently labelled sample cDNA. Each spot measures a fraction of a millimetre and represents an individual unique gene. The degree of redness of each spot reflects the relative amount of mRNA for that particular gene in the sample tested. b, Part of section of a 284-core lymphoma tissue microarray19 stained with Bcl-2 antibody. Each core has a cross-sectional diameter of 0.6 mm.
Bioinformatics
The amount of data generated by studies using gene microarrays is astonishing. For example, by studying 40 tumours on gene arrays with 40 000 gene spots, 1.6 million data points are obtained. On a somewhat smaller scale, a TMA with 400 tumours on which 20 stains are performed will yield 8000 immunostain results. It is clear that such enormous numbers of data points require novel analytical methods. The management and utilization of these data represent arguably the single biggest challenge to histopathologists in maximizing the utility of these technologies. For gene microarray analysis two broad categories of analysis exist, specifically supervised analysis in which
directed questions are asked of the dataset,8–10 and unsupervised analysis in which the dataset is organized based on similarities in gene expression levels in the different tumours, without consideration of information other than gene expression levels, such as clinical or pathological data.11 Both types of analysis have their advantages and disadvantages and many researchers apply both when examining the large datasets generated by gene array studies. An example of unsupervised analysis is shown in Figure 3. A number of software packages have been developed to aid in these studies and many are available on the web (e.g. http://rana.lbl.gov/EisenSoftware.htm). It has become standard practice for research groups to make the entire
dataset on which a publication is based available on the worldwide web. In fact, many journals now require this for publication of array datasets. In this manner other researchers can use their own analytical methods to mine multiple datasets.12 Examples of such websites are: http://genome-www.stanford.edu/MicroArray, http://www-genome.wi.mit.edu/cgi-bin/cancer/publications/pub_menu.cgi, and http://www.rii.com/publications/default.htm, where datasets can be downloaded and/or clustered through a user-friendly interface. Given the fact that the datasets can be of staggering size and that multiple different analytical techniques exist, it is not surprising that at first glance
Figure 3. An example of a frequently used representation of gene array data, using unsupervised hierarchical clustering. Tumours are arranged in columns and have been grouped together based on their degree of shared gene expression levels. The dendrogram above the tumours reflects the level of relatedness between the tumours with short branches denoting a high degree of similarity. The genes are arranged in rows and are also grouped together based on their shared similarity in expression across the tumours. The overlap between a tumour (columns) and a gene (rows) is coloured red when there is a relatively high level of mRNA expression for that gene in the tumour, and green when there is a relatively low level of expression. As shown, myxoid chondrosarcomas are much more related to each other than to other tumours, based primarily on the expression of a large group of genes including keratin genes (Subramanian et al., unpublished data). [Figure image not reproduced. Heat map of 3450 genes (rows) against 63 STT (columns); genes labelled on the image: cadherin 3, keratin 18, keratin 8, keratin 19, keratin 8, keratin 8, claudin 7, keratin 7, keratin 6B, keratin 17.]
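The grouping of tumours in a display of this kind can be sketched in a few lines of code. The example below is a minimal illustration and not the software used to produce Figure 3: it assumes average-linkage agglomerative clustering with a distance of 1 minus the Pearson correlation, and the tumour names and expression values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering of samples; returns merges in the order made.

    profiles: dict mapping sample name -> list of log expression ratios.
    Each merge is (members_a, members_b, distance). Merges made at a small
    distance correspond to short branches in a dendrogram.
    """
    # Pairwise distance between samples: 1 - Pearson correlation (assumed metric).
    dist = {frozenset(pair): 1.0 - pearson(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(profiles, 2)}

    def linkage(pair):
        # Average distance over all cross-cluster pairs of samples.
        a, b = pair
        total = sum(dist[frozenset((i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Invented log ratios for four hypothetical tumours across four genes.
profiles = {
    "tumour_A": [2.0, 1.8, -1.0, -1.2],
    "tumour_B": [1.9, 2.1, -0.8, -1.1],
    "tumour_C": [-1.5, -1.0, 2.2, 1.9],
    "tumour_D": [-1.2, -1.4, 1.8, 2.3],
}
merges = average_linkage(profiles)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

With these invented values, tumours A and B pair off, as do C and D, before the two pairs are joined at a much larger distance. The same routine applied to the transposed matrix would group genes rather than tumours.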
publications from two different groups on the same tumour may not identify the same sets of genes as being of (for example) prognostic significance. It should be kept in mind that most reports that discuss gene microarray findings show one interpretation of a large dataset. The results of this interpretation may vary somewhat depending on a number of factors. These include the type of arrays used, the genes represented on them, and the analysis techniques used. Significant differences can be found when supervised versus unsupervised methods of analysis are used. In particular, the type of question that is asked of the dataset can lead to significantly different gene lists. An example of this can be seen in the data from two different groups studying breast carcinoma. In one study, supervised analysis was used to determine which set of genes would predict outcome in breast carcinoma patients.9 In a study from another group, unsupervised clustering was used to determine the relatedness between different cases of breast carcinoma; as a result, a previously unrecognized subtype of breast carcinoma with a poor clinical outcome was discovered.13 At first glance these two approaches showed very little overlap in the genes identified as being significant in classification. A recent report demonstrated, however, that when the same analytical approach was applied to both datasets, very significant overlap existed.14 Other discrepancies between reports result from differences in the degree of variation in expression that a gene must show to be included in the final published dataset (referred to as data filtering).
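Data filtering of this kind can be illustrated as follows. This is a sketch under assumed criteria, not the rule used in any of the cited studies: a gene is retained when its log2 expression ratio deviates from its own median across tumours by at least a chosen threshold in at least a chosen number of tumours. The function name, thresholds and values are hypothetical.

```python
from statistics import median


def filter_genes(expression, min_abs_log2=1.0, min_samples=2):
    """Keep genes whose expression varies enough across samples.

    expression: dict mapping gene name -> list of log2 ratios (one per tumour).
    A gene passes when at least `min_samples` tumours differ from the gene's
    median by `min_abs_log2` or more (both thresholds are assumptions).
    """
    kept = {}
    for gene, values in expression.items():
        centre = median(values)
        n_variable = sum(1 for v in values if abs(v - centre) >= min_abs_log2)
        if n_variable >= min_samples:
            kept[gene] = values
    return kept


# Invented values for three hypothetical genes in five tumours.
expression = {
    "gene_1": [0.1, -0.2, 0.0, 0.3, -0.1],  # nearly flat
    "gene_2": [2.5, -1.8, 0.2, 0.1, 0.0],   # varies strongly in two tumours
    "gene_3": [0.0, 0.1, 1.6, 0.2, -0.1],   # varies strongly in one tumour only
}

strict = filter_genes(expression, min_abs_log2=1.0, min_samples=2)
lenient = filter_genes(expression, min_abs_log2=1.0, min_samples=1)
print(sorted(strict))
print(sorted(lenient))
```

The two settings return different gene lists from identical measurements, which is why a list published after filtering cannot be compared directly with one filtered under other criteria.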
It is for this reason that the entire unfiltered dataset of array studies should be made available to the research community, so that comparisons between the results from different groups can be made.15 Another reason for publishing the entire dataset of gene expression profiling experiments lies in the fact that few if any of the laboratories working in this field have the ability to follow up on all the leads from these experiments. By publishing the entire dataset and not just those data described in the manuscript an enormous amount of data has already been made public on the worldwide web. These datasets present an excellent opportunity for members of the surgical pathology community to search for new potential markers for specific diseases. Because they do not require fresh frozen clinical samples, TMAs allow for a rapid validation and extension of the gene expression studies on many
more samples than are generally available for gene microarray studies. Since archival samples are used, TMA data can also be linked to long-term patient outcome data. With cores from many hundreds of tumours represented on a single TMA, they can generate an enormous amount of data. The data are complex, since the TMA sections yield information on staining intensity and staining distribution in the tissue specimen, e.g. nuclear versus cytoplasmic staining, stromal versus tumour cell staining, etc. In contrast, gene microarray data consist of objectively measured fluorescence levels in spots that contain a single gene. Thus, unlike gene expression studies, the scoring of TMA IHC experiments remains subjective and not easily quantifiable. Significant efforts are being made to quantify protein expression levels on TMA sections,16,17 but for the foreseeable future most TMAs will be subjectively scored by eye. It is essential therefore that TMA data can be revisited frequently and in a rapid manner. Doing this on the actual glass slides is a laborious process. Digital imaging of TMA sections stained with a variety of antibodies can help, but this is, in essence, not much more useful than examining the actual sections of the TMA directly under the microscope. In both cases it can be cumbersome to find the co-ordinates of a core from a specific sample. This becomes especially problematic when different stains performed on the same core have to be compared. When evaluating a variety of stains on the same core within a TMA, both approaches will be time consuming, as the core in question will need to be identified for each section stained. It is for this reason that we and others have developed systems that allow the retrieval of images of sections of the same tissue core stained with multiple different stains from a library of digital images scanned in through a computerized microscope.18,19 An example of this approach is shown in Figure 4. 
The availability of digital images from all stained cores also allows for the publication of all (up to thousands) immunostain results through websites affiliated with the immunohistochemical study.20 This contrasts with earlier studies that not only were performed on many fewer samples but also could only show one or two examples of immunostaining results, with best rather than representative results often shown. Data mining of TMA data is more difficult than for gene expression data, because of the subjective nature of TMA scoring. A proposal has been put forward, nonetheless, for a standardized data format for presentation of TMA data, to facilitate access to published datasets (http://www.biomedcentral.com/1472-6947/3/5).
[Figure 4 image not reproduced. Panels A to E. Labels on the image: ‘Section based review’ (TMA sections stained for H&E, Bcl-2, Ki-67 and S-100; scale 600 µm), ‘Digital image of one section does not allow comparison of stains on one core’, ‘Core based review’, ‘Bcl-2 staining’, ‘Click on tumour’, ‘Combined digital images of many sections from same core’.]
Figure 4. Software and digital images aid in the evaluation of tissue microarray (TMA) data. TMA paraffin block containing several hundred 0.6-mm diameter cores. From this block multiple sections are cut, each of which will be used in an immunohistochemical stain or an in situ hybridization reaction. A, Sections from a TMA can be examined under the microscope, but finding the same core on sections stained with different antibodies is cumbersome. B, Likewise, finding the correct cores on digital images taken from the same slide is tedious and the files tend to be large. C, These problems are circumvented by scoring TMA slides on Excel worksheets; multiple stains on the same TMA are then combined in a workbook that can be ‘Deconvoluted’ to generate a precluster file. Next the data are clustered and rendered in ‘Treeview’ in a format similar to that used for gene arrays but now with tumours in rows and antibody or in situ hybridization stains in columns.19 D, By clicking on a tumour on the computer screen, the investigator is now linked to a server that contains tens of thousands of relatively small digital images from individual cores (not images from the entire TMA slide) on the TMA. E, The images for the selected tumour can then be shown on screen, generating a ‘virtual core’ consisting of all sections stained from that particular sample. The images can be opened further in a much higher resolution than shown here, allowing for comparison between different staining reactions on the same tissue core.19
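The step in panel C that turns per-stain score sheets into a single table can be pictured as a re-indexing from slide position to sample. The sketch below is not the software cited in the caption; it assumes a hypothetical layout in which each stain is scored on a grid mirroring the TMA, a sector map gives the case identifier at each (row, column) position, and each case is represented by one core. All names and scores are invented.

```python
def deconvolute(sector_map, score_sheets, missing=None):
    """Combine per-stain score grids into one row of scores per case.

    sector_map: dict (row, col) -> case identifier for each core position.
    score_sheets: dict stain name -> dict (row, col) -> score.
    Returns (stains, table), where table maps case -> list of scores in the
    same order as `stains`. Cores lost on a section are filled with `missing`.
    """
    stains = sorted(score_sheets)
    table = {}
    for position, case in sector_map.items():
        table[case] = [score_sheets[s].get(position, missing) for s in stains]
    return stains, table


# Hypothetical three-core array scored for two stains.
sector_map = {(1, 1): "case_001", (1, 2): "case_002", (2, 1): "case_003"}
score_sheets = {
    "Bcl-2": {(1, 1): 2, (1, 2): 0, (2, 1): 1},
    "Ki-67": {(1, 1): 1, (1, 2): 2},  # core at (2, 1) lost on this section
}

stains, table = deconvolute(sector_map, score_sheets)
for case in sorted(table):
    print(case, dict(zip(stains, table[case])))
```

A table in this shape, with tumours in rows and stains in columns, is what a clustering program would then take as input; keying images by the same case identifier is one way the stains on a single core could be retrieved together.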
Gene expression profiling
It has been suggested that gene expression profiling could replace histopathological examination of tissue samples.21 We do not think so. The reason is that histopathology is inexpensive and highly effective in prediction of behaviour and response to therapy. Moreover, it seems likely that much of the prognostic information in a tumour is already adequately defined through determination of tumour grade, size, presence of necrosis, etc. As well, histopathological examination is required to assess resection margins, a significant consideration for any tumour where complete surgical excision is important in therapy. Nevertheless, we
believe that many significant opportunities exist for application of gene expression profiling to improve patient care. First, there is a finite amount of information available in a histological section of tumour, and we believe that we are reaching the limits of classification based purely on morphology. This creates an opportunity to use gene expression profiling as a diagnostic adjunct to improve the diagnostic subclassification of tumours; in the case of lymphomas, in particular, we have seen the importance of flow cytometry, IHC, cytogenetics and molecular diagnostics in improving diagnostic accuracy over that afforded by light microscopy alone. The significance of finding novel subclassifications of
existing tumour types lies, of course, in the possibility that previously unrecognized subsets of tumours may show an increased response to certain therapies. Gene expression profiling has already been shown to be able to provide prognostically significant molecular subclassification of breast carcinomas,8,10,13,14,22 lymphomas,23–26 lung carcinomas,27,28 gliomas,29 and medulloblastomas30 beyond what is possible based on conventional histopathological examination, and this list will continue to grow in the years to come. Second, gene expression profiling could be applied to cases that are diagnostically challenging using current techniques, e.g. small blue cell tumours,31 soft tissue sarcomas,32–34 carcinomas of unknown primary site. The experience with this type of application is very preliminary indeed. In a study of dermatofibrosarcoma protuberans (DFSP), we were able to distinguish examples of DFSP from cellular fibrous histiocytoma and myxofibrosarcoma based on their gene expression profiles,6 but more experience, including correlation with clinical outcome, is required to demonstrate that gene expression profiling is a significant improvement over the histological evaluation of these cases. Third, gene expression profiling can be used to discover new markers that might predict a response to targeted therapy. Examples of existing markers are oestrogen receptor (ER) and HER2 in breast carcinoma and kit in gastrointestinal stromal tumour. Recently, we found that epidermal growth factor receptor (EGFR) was expressed at higher levels in synovial sarcoma than in a variety of other sarcomas.20,32 Partly as a result of this finding a clinical trial is now being performed at the European Organization for the Research and Treatment of Cancer using an EGFR inhibitory drug. 
All cell surface markers that show relative specificity for a particular tumour are potential targets for therapy, either by using antibody-mediated drug delivery or through small molecule inhibitors of functional proteins. Is it possible that gene arrays themselves may one day find application in surgical pathology laboratories? It seems likely that with increasing simplification of RNA isolation techniques, reverse transcription methods, etc., the use of gene arrays within diagnostic laboratories may have some future applications. However, there are two significant impediments to the introduction of gene expression profiling into routine diagnostic practice, even if all the quality assurance and reproducibility issues could be addressed. First, the subgroups identified based on gene expression profiling can potentially be identified by techniques already widely in use, including IHC and in situ hybridization. If
equivalent information can be obtained through these methods, especially if they are less expensive than gene expression profiling, as seems likely, it will remain a tool of discovery rather than a tool of routine diagnosis. As an example, the ‘basal’ phenotype of breast carcinoma, identified by gene expression profiling as being a marker of worse prognosis,13,14 can also be identified by immunostaining for cytokeratins 5/6 and 17.35 Second, the ability to offer prognostic information that does not influence patient management, even highly statistically significant prognostic information, is of limited clinical impact. Although the difference between 40% and 60% survival at 5 years is highly significant, the impact on an individual patient, unless it influences treatment decisions, is unlikely to be sufficient to justify the cost of obtaining that information by gene expression profiling.
Array-based comparative genomic hybridization
Gene microarrays have been used for comparative genomic hybridization. In this technique, genomic DNA is fluorescently labelled and used to determine the presence of gene loss or amplification.4,5,36,37 Array-based comparative genomic hybridization (aCGH) has been used to map genetic abnormalities in a wide range of tumours, including breast carcinoma,37 DFSP,6 bladder carcinoma,38 fallopian tube carcinoma,39 gastric carcinoma,40 melanoma,41 and lymphoma,42 to name a few. aCGH has also been successfully applied to high-resolution molecular karyotyping of patients with genetic disorders.41,43 Unlike gene expression array analysis, for which fresh frozen tissue is required, aCGH can also be performed on material isolated from formalin-fixed paraffin-embedded samples.6,44 Thus the large collections of clinical samples stored in the archives of histopathology laboratories are suitable for research. The importance of this aspect is that many samples with existing clinical follow-up and many samples from defined clinical trial protocols are available for study. The latter samples are particularly valuable, as samples from randomized clinical trials allow testing of the influence of specific gene amplification events on response to the therapy. A variety of techniques for aCGH are being used, and many researchers use arrays that are made from bacterial artificial chromosome (BAC) fragments. At Stanford the same cDNA gene arrays used for gene expression are utilized for aCGH. This allows for a direct comparison between gene expression profiles and genomic copy number alterations,6,36 and it becomes
clear from these studies that a significant percentage of genes that are highly expressed at the mRNA level also show an amplification of the genes at the DNA level. Several potential clinical applications for aCGH exist. For example, in pilot experiments we have shown that classical cases of melanoma and Spitz naevus can be distinguished using cDNA gene arrays.45 These findings are in agreement with results of previous studies employing conventional CGH or fluorescence in situ hybridization (FISH) techniques.46–48 Of course, these studies need to be extended to include many more samples and should evaluate the utility of these techniques on samples that are ambiguous on histological examination, comparing histological examination versus aCGH in predicting behaviour. In the case of melanocytic lesions, the ability to use paraffin-embedded material is especially important as the entire lesion is usually submitted for histology, with no frozen material left for additional studies.
Tissue microarrays
TMAs have different but complementary applications, compared with gene expression profiling or aCGH. Immunohistochemical studies on TMAs have been used for validation of gene expression profiling results,49 documenting that in most instances increased mRNA expression correlates with increased protein levels. This is not invariably the case; for example α1-antitrypsin is expressed at very high levels in hepatocytes but rapidly exported so that there is relatively little intracellular protein,50 and caspase-3 protein is undetectable in skeletal muscle despite the presence of abundant mRNA transcripts, through posttranscriptional regulation of protein synthesis.51 In a recent study, Ginestier et al.
showed that there can be significant differences between RNA expression levels as measured by hybridization to nylon membranes and protein levels as measured by IHC.52 Identification of molecules of interest by gene expression profiling can be rapidly validated by IHC only if there are existing antibodies against the gene product in question. After demonstration of a relatively increased mRNA expression of the EGFR in synovial sarcoma,32 we could confirm the presence of high protein levels for this molecule using an existing antibody on a TMA containing over 40 synovial sarcomas.20 For individual genes identified as being of importance for diagnosis, prognostication, or prediction of response to therapy by gene expression profiling, TMAs offer the most direct approach to validation. This is also true for gene amplification events identified by aCGH; validation of the importance of specific amplicons can be rapidly
accomplished through FISH or chromogenic ISH analysis of TMAs. For ISH to detect gene copy number changes there is not even the impediment of needing an antibody, as for any gene on a cDNA microarray it is theoretically possible to generate rapidly a suitable probe for ISH. Because IHC is routinely used in diagnostic histopathology laboratories and can be applied to formalin-fixed paraffin-embedded tissues, it is probable that after discovery of individual genes of interest through gene expression profiling, the evolution to clinical practice will lead first to introduction of new IHC assays, rather than introduction of cDNA microarray technology for assessment of these genes. Recently, modifications of previous mRNA ISH protocols53,54 have allowed application of this technique to formalin-fixed paraffin-embedded material (Figure 5). It is to be expected that in the future many of the applications of TMAs will involve mRNA ISH, as synthesis of ISH probes can be performed much more rapidly than generation of antipeptide antisera or monoclonal antibodies. As discussed previously, gene expression data can identify groups of cases with significantly different outcomes, where routine histopathological examination does not permit subclassification. Although expression levels of large numbers of genes are used, the ultimate cluster grouping can often be reproduced using a much smaller number of genes (50 or fewer). Hierarchical clustering analysis can be applied to immunohistochemical data55 and it is possible that by using a panel of antibodies, the cluster groupings derived from gene expression data can be reproduced based on IHC staining results.
Although there have not been complete reproductions of the gene expression experiments using IHC, primarily because the full complement of necessary antibodies is not available, clustering of cases of lung, endometrial and breast carcinoma based on IHC staining of TMAs has been done and has, for endometrial and breast carcinoma, identified groups with differing prognoses (Gilks CB, unpublished data) (Figure 6). Whether application of panels of immunomarkers and interpretation of results using clustering to define molecular subgroups of tumours will prove more powerful than the traditional approach of using single markers remains to be seen. Tissue microarrays composed of numerous different tissues and tumour types, referred to as ‘prevalence TMAs’,56 afford the opportunity to establish very rapidly the prevalence of expression of a given protein.57–59 This ability to determine the sensitivity and specificity of a new immunostain for a given diagnosis should revolutionize the introduction of new antibodies
Figure 5. Comparison between in situ hybridization (ISH) and immunohistochemistry (IHC) on paraffin-embedded formalin-fixed material. Low-power section of nine cores on a sarcoma tissue microarray69 stained with IHC and ISH. a, CD117 IHC highlighting gastrointestinal stromal tumour (GIST) cases. b, DOG1, a novel GIST marker (West RB et al., unpublished data) stained by antiserum raised against synthetic peptides derived from a gene identified in gene expression studies. c, ISH for CD117 mRNA. d, ISH for DOG1 mRNA.
into the diagnostic IHC laboratory. For example, CD34 was initially described as being a useful adjunct in the diagnosis of vascular tumours60 but subsequently was
found to be a marker of solitary fibrous tumour.61 This was followed by a number of publications reporting the presence of CD34 in a wide variety of soft-tissue
Figure 6. Hierarchical clustering analysis of 405 cases of breast carcinoma, based on immunostaining results with 21 antibodies. The intersection between a column (antibody) and row (tumour) is bright red when the tumour cells are strongly immunoreactive, dark red or black when the tumour shows intermediate or weak immunoreactivity, respectively, and green when negative. White indicates missing data. a, Three different cluster groups (1, 2, and 3) are identified based on unsupervised hierarchical cluster analysis. b, The disease-specific survival of the patients in these three groups is significantly different (P < 0.0001). [Figure image not reproduced. Panel a: heat map of 21 antibodies against 405 breast carcinomas, with cluster groups 1, 2 and 3 marked. Panel b: disease-specific survival (0.0 to 1.0) plotted against total follow-up (0 to 30 years) for groups 1, 2 and 3.]
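Survival curves such as those in panel b are conventionally drawn with the Kaplan–Meier estimator. The caption does not state which method was used, so the sketch below is an assumption offered for illustration only, and the follow-up data for the two hypothetical groups are invented.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of survival.

    times: follow-up time for each patient (years).
    events: 1 if the patient died of disease at that time, 0 if censored.
    Returns a list of (time, survival probability) at each time with a death.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Invented follow-up data: (times in years, event flags) per cluster group.
groups = {
    1: ([2, 5, 8, 12, 15], [0, 1, 0, 0, 0]),
    2: ([1, 2, 3, 4, 10], [1, 1, 1, 0, 1]),
}
curves = {g: kaplan_meier(t, e) for g, (t, e) in groups.items()}
for g, curve in curves.items():
    print("group", g, [(t, round(s, 2)) for t, s in curve])
```

Comparing the resulting curves between cluster groups would additionally need a significance test, which is not shown here.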
tumours.62–64 As a result, some referred to CD34 as ‘the vimentin of the nineties’ (R. Kempson, personal communication). Because TMAs allow a large number of cases to be stained at minimal cost, the staining profile of a new antibody can be established rapidly rather than over a period of years.

TMAs have immediate advantages for quality assurance in diagnostic IHC or ISH laboratories. Because each laboratory may use different protocols for fixation and tissue processing, published staining results obtained in other laboratories may not apply directly. All new antibodies introduced into diagnostic use should undergo a trial with a number of cases processed ‘in house’ to ensure that the profile of cases stained is as expected. For reasons of cost, this trial typically includes only a limited number of cases. In contrast, a TMA composed of numerous tissue and tumour types processed in that laboratory allows rapid assessment of the staining profile of a new antibody on material from that laboratory.59,65 The same TMA can then be used to test new aliquots of antibody, or for periodic assessment of staining to ensure that the immunoreactivity of cases recognized by a given antibody has not changed over time.

TMAs also offer advantages for testing interlaboratory variation in staining and interobserver variation in the interpretation of staining results.59,66,67 Instead of a single case, we used a small TMA with 29 cases of breast carcinoma to compare results of staining for ER in different laboratories.
The advantage was that, instead of using just a single strongly ER+ case, we were able to present a set of cases showing a spectrum of ER positivity; using this array we determined that there was decreased sensitivity in detection of weakly ER+ cases in one of the participating laboratories.66 We were also able to ascertain that the problem lay in technical factors rather than in the interpretation of the staining. This would not have been possible based on a section of a single tumour. Von Wasielewski et al., in a much larger study involving 172 laboratories, also documented significant problems in the reproducibility of recognition of weak ER positivity, and were able to trace the technical basis of poor sensitivity to insufficient antigen retrieval.67 Using the digital imaging software previously described (Figure 4), it would be possible to have a TMA-based IHC quality assurance programme in which participants could, via the internet, compare their staining results on the same tissue samples with those from other laboratories.

A final issue for consideration is whether TMAs can replace whole sections for diagnostic work. This is particularly relevant for breast carcinoma where ER,
Table 1. Microarray applications in routine diagnostic pathology

Immediate
1. TMAs for characterization of new monoclonal antibodies for IHC and periodic testing of currently used antibodies for sensitivity and specificity of staining
2. TMAs for interlaboratory quality assurance programmes for IHC and ISH

Future
1. TMAs in lieu of whole sections for adjuvant testing (e.g. ER, PR, HER2)
2. Gene expression profiling for differential diagnosis (e.g. small blue cell tumours, carcinoma of unknown primary site)
3. Gene expression profiling for molecular subclassification of tumours (e.g. breast carcinoma, lung carcinoma, lymphoma)
4. aCGH for differential diagnosis (e.g. Spitz naevus versus melanoma)
5. Gene expression profiling and/or aCGH for identification of molecular therapeutic targets, with the goal of achieving individualized therapy

TMA, Tissue microarray; IHC, immunohistochemistry; ISH, in situ hybridization; ER, oestrogen receptor; PR, progesterone receptor; aCGH, array-based comparative genomic hybridization.
progesterone receptor (PR) and HER2 are routinely assessed on every case. Batching cases onto TMAs would yield significant savings in reagent and technical costs. This raises the question of whether cores of tumour are representative of whole sections. Numerous studies have validated the correlation between immunostaining of one to four 0.6-mm cores and staining of whole sections (reviewed by Simon et al.56). Intriguingly, Torhorst et al. found that the prognostic significance of immunostaining for p53, ER and PR in a large series of breast carcinomas, based on staining of a single 0.6-mm core, was as good as or better than that obtained using whole sections.68 This strongly suggests that the use of small cores can be equivalent to whole-section immunohistochemistry and may, in fact, prove superior, presumably by exclusion of cases with weak, focal staining. Further studies are needed, specifically looking at the cases that are discordant by TMA versus whole-section IHC; if these cases behave like the IHC-negative group, that would support TMAs as being superior to whole sections, while if they behave like the IHC-positive group, it would indicate that the use of small cores gives false-negative results.
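The comparison of TMA cores with whole sections described above can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the method of any of the cited studies: each case is assumed to carry a simple positive/negative call from both methods, Cohen's kappa is used as one common chance-corrected agreement statistic, and the case identifiers and calls are invented purely for illustration. The function names `cohen_kappa` and `discordant_cases` are hypothetical.

```python
from collections import Counter


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two paired lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of cases given the same call by both methods.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods gave one identical call for every case
    return (observed - expected) / (1.0 - expected)


def discordant_cases(case_ids, tma_calls, whole_calls):
    """Return (case_id, tma_call, whole_call) for every disagreement."""
    return [
        (cid, t, w)
        for cid, t, w in zip(case_ids, tma_calls, whole_calls)
        if t != w
    ]


# Invented illustration data: ER calls for ten hypothetical cases.
case_ids = [f"case{i:02d}" for i in range(1, 11)]
whole_calls = ["pos"] * 6 + ["neg"] * 4  # whole-section IHC
tma_calls = ["pos"] * 5 + ["neg"] * 5    # single-core TMA IHC

kappa = cohen_kappa(whole_calls, tma_calls)
print(f"kappa = {kappa:.2f}")
for cid, tma, whole in discordant_cases(case_ids, tma_calls, whole_calls):
    print(f"{cid}: TMA {tma}, whole section {whole}")
```

The list returned by `discordant_cases` is the subgroup whose clinical outcome would need to be followed to decide whether the TMA or the whole-section result is the more informative.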
Summary
Potential applications of microarray technology in diagnostic histopathology are summarized in Table 1. There are some immediate applications for TMAs in the diagnostic laboratory, while gene expression profiling and aCGH face a significant developmental period before they can be routinely used.
Acknowledgements
We express our appreciation to members of our laboratories for assistance in data generation, production of figures, and critical review of this manuscript.
References
1. Brown PO, Botstein D. Exploring the new world of the genome with DNA microarrays. Nat. Genet. 1999; 21; 33–37.
2. Granjeaud S, Bertucci F, Jordan BR. Expression profiling: DNA arrays in many guises. Bioessays 1999; 21; 781–790.
3. Lockhart DJ, Winzeler EA. Genomics, gene expression and DNA arrays. Nature 2000; 405; 827–836.
4. Pinkel D, Segraves R, Sudar D et al. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat. Genet. 1998; 20; 207–211.
5. Pollack JR, Perou CM, Alizadeh AA et al. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat. Genet. 1999; 23; 41–46.
6. Linn SC, West RB, Pollack JR et al. Gene expression patterns and gene copy number changes in dermatofibrosarcoma protuberans. Am. J. Pathol. 2003; 163; 2383–2395.
7. Kononen J, Bubendorf L, Kallioniemi A et al. Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nat. Med. 1998; 4; 844–847.
8. West M, Blanchette C, Dressman H et al. Predicting the clinical status of human breast cancer by using gene expression profiles. Proc. Natl. Acad. Sci. USA 2001; 98; 11462–11467.
9. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionising radiation response. Proc. Natl. Acad. Sci. USA 2001; 98; 5116–5121.
10. van ’t Veer LJ, Dai H, van de Vijver MJ et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002; 415; 530–536.
11. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 1998; 95; 14863–14868.
12. Rhodes DR, Barrette TR, Rubin MA, Ghosh D, Chinnaiyan AM. Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Res. 2002; 62; 4427–4433.
13. Sorlie T, Perou CM, Tibshirani R et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl. Acad. Sci. USA 2001; 98; 10869–10874.
14. Sorlie T, Tibshirani R, Parker J et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc. Natl. Acad. Sci. USA 2003; 100; 8418–8423.
15. Perou CM. Show me the data! Nat. Genet. 2001; 29; 373.
16. Camp RL, Chung GG, Rimm DL. Automated subcellular localization and quantification of protein expression in tissue microarrays. Nat. Med. 2002; 8; 1323–1327.
17. Camp RL, Dolled-Filhart M, King BL, Rimm DL. Quantitative analysis of breast cancer tissue microarrays shows that both high and normal levels of HER2 expression are associated with poor outcome. Cancer Res. 2003; 63; 1445–1448.
18. Manley S, Mucci NR, De Marzo AM, Rubin MA. Relational database structure to manage high-density tissue microarray data and images for pathology studies focusing on clinical outcomes. Am. J. Pathol. 2001; 159; 837–843.
19. Liu CL, Prapong W, Natkunam Y et al. Software tools for high-throughput analysis and archiving of immunohistochemistry staining data obtained with tissue microarrays. Am. J. Pathol. 2002; 161; 1557–1565.
20. Nielsen TO, Hsu FD, O’Connell JX et al. Tissue microarray analysis of novel synovial sarcoma immunohistochemical markers. Am. J. Pathol. 2003; 163; 1449–1456.
21. Aparicio SA, Caldas C, Ponder B. Does massively parallel transcriptome analysis signify the end of cancer histopathology as we know it? Genome Biol. 2000; 1; 1021-1–1021-3.
22. van de Vijver MJ, He YD, van ’t Veer LJ et al. A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 2002; 347; 1999–2009.
23. Alizadeh AA, Eisen MB, Davis RE et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403; 503–511.
24. Shipp MA, Ross KN, Tamayo P et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat. Med. 2002; 8; 68–74.
25. Rosenwald A, Wright G, Chan WC et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N. Engl. J. Med. 2002; 346; 1937–1947.
26. Rosenwald A, Wright G, Wiestner A et al. The proliferation gene expression signature is a quantitative integrator of oncogenic events that predicts survival in mantle cell lymphoma. Cancer Cell 2003; 3; 185–197.
27. Bhattacharjee A, Richards WG, Staunton J et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. USA 2001; 98; 13790–13795.
28. Garber ME, Troyanskaya OG, Schluens K et al. Diversity of gene expression in adenocarcinoma of the lung. Proc. Natl. Acad. Sci. USA 2001; 98; 13784–13789.
29. Nutt CL, Mani DR, Betensky RA et al. Gene expression-based classification of malignant gliomas correlates better with survival than histological classification. Cancer Res. 2003; 63; 1602–1607.
30. Pomeroy SL, Tamayo P, Gaasenbeek M et al. Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature 2002; 415; 436–442.
31. Khan J, Wei JS, Ringner M et al. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat. Med. 2001; 7; 673–679.
32. Nielsen TO, West RB, Linn SC et al. Molecular characterisation of soft tissue tumours: a gene expression study. Lancet 2002; 359; 1301–1307.
33. Lee YF, John M, Edwards S et al. Molecular classification of synovial sarcomas, leiomyosarcomas and malignant fibrous histiocytomas by gene expression profiling. Br. J. Cancer 2003; 88; 510–515.
34. Segal NH, Pavlidis P, Antonescu CR et al. Classification and subtype prediction of adult soft tissue sarcoma by functional genomics. Am. J. Pathol. 2003; 163; 691–700.
35. van de Rijn M, Perou CM, Tibshirani R et al. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am. J. Pathol. 2002; 161; 1991–1996.
36. Solinas-Toldo S, Lampel S, Stilgenbauer S et al. Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes Chromosomes Cancer 1997; 20; 399–407.
37. Pollack JR, Sorlie T, Perou CM et al. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc. Natl. Acad. Sci. USA 2002; 99; 12963–12968.
38. Veltman JA, Fridlyand J, Pejavar S et al. Array-based comparative genomic hybridization for genome-wide screening of DNA copy number in bladder tumors. Cancer Res. 2003; 63; 2872–2880.
39. Snijders AM, Nowee ME, Fridlyand J et al. Genome-wide array-based comparative genomic hybridization reveals genetic homogeneity and frequent copy number increases encompassing CCNE1 in fallopian tube carcinoma. Oncogene 2003; 22; 4281–4286.
40. Weiss MM, Snijders AM, Kuipers EJ et al. Determination of amplicon boundaries at 20q13.2 in tissue samples of human gastric adenocarcinomas by high-resolution microarray comparative genomic hybridization. J. Pathol. 2003; 200; 320–326.
41. Buckley PG, Mantripragada KK, Benetkiewicz M et al. A full-coverage, high-resolution human chromosome 22 genomic microarray for clinical and research applications. Hum. Mol. Genet. 2002; 11; 3221–3229.
42. Martinez-Climent JA, Alizadeh AA, Segraves R et al. Transformation of follicular lymphoma to diffuse large cell lymphoma is associated with a heterogeneous set of DNA copy number and gene expression alterations. Blood 2003; 101; 3109–3117.
43. Veltman JA, Jonkers Y, Nuijten I et al. Definition of a critical region on chromosome 18 for congenital aural atresia by array CGH. Am. J. Hum. Genet. 2003; 72; 1578–1584.
44. Paris PL, Albertson DG, Alers JC et al. High-resolution analysis of paraffin-embedded and formalin-fixed prostate tumors using comparative genomic hybridization to genomic microarrays. Am. J. Pathol. 2003; 162; 763–770.
45. Harvell JD, Kohler S, Zhu S, Hernandez-Boussard T, Pollack JR, van de Rijn M. High resolution array-based comparative genomic hybridization for distinguishing paraffin-embedded Spitz nevi and melanomas. Diagn. Mol. Pathol. 2004; in press.
46. Bastian BC, Wesselmann U, Pinkel D, Leboit PE. Molecular cytogenetic analysis of Spitz nevi shows clear differences to melanoma. J. Invest. Dermatol. 1999; 113; 1065–1069.
47. Mihic-Probst D, Zhao J, Saremaslani P, Baer A, Komminoth P, Heitz PU. Spitzoid malignant melanoma with lymph-node metastasis. Is a copy-number loss on chromosome 6q a marker of malignancy? Virchows Arch. 2001; 439; 823–826.
48. Harvell JD, Bastian BC, Leboit PE. Persistent (recurrent) Spitz nevi: a histopathologic, immunohistochemical, and molecular pathologic study of 22 cases. Am. J. Surg. Pathol. 2002; 26; 654–661.
49. Bubendorf L, Kolmer M, Kononen J et al. Hormone therapy failure in human prostate cancer: analysis by complementary DNA and tissue microarrays. J. Natl. Cancer Inst. 1999; 91; 1758–1764.
50. Gaillard-Sanchez I, Bruneval P, Clauser E et al. Successful detection by in situ cDNA hybridization of three members of the serpin family: angiotensinogen, alpha 1 protease inhibitor, and antithrombin III in human hepatocytes. Mod. Pathol. 1990; 3; 216–222.
51. Ruest LB, Khalyfa A, Wang E. Development-dependent disappearance of caspase-3 in skeletal muscle is post-transcriptionally regulated. J. Cell Biochem. 2002; 86; 21–28.
52. Ginestier C, Charafe-Jauffret E, Bertucci F et al. Distinct and complementary information provided by use of tissue and DNA microarrays in the study of breast tumor markers. Am. J. Pathol. 2002; 161; 1223–1233.
53. St Croix B, Rago C, Velculescu V et al. Genes expressed in human tumor endothelium. Science 2000; 289; 1197–1202.
54. Iacobuzio-Donahue CA, Argani P, Hempen PM, Jones J, Kern SE. The desmoplastic response to infiltrating breast carcinoma: gene expression at the site of primary invasion and implications for comparisons between tumor types. Cancer Res. 2002; 62; 5351–5357.
55. Al-Kushi A, Irving J, Hsu F et al. Immunoprofile of cervical and endometrial adenocarcinomas using a tissue microarray. Virchows Arch. 2003; 442; 271–277.
56. Simon R, Mirlacher M, Sauter G. Tissue microarrays in cancer diagnosis. Expert Rev. Mol. Diagn. 2003; 3; 421–430.
57. Schraml P, Kononen J, Bubendorf L et al. Tissue microarrays for gene amplification surveys in many different tumor types. Clin. Cancer Res. 1999; 5; 1966–1975.
58. Andersen CL, Monni O, Wagner U et al. High-throughput copy number analysis of 17q23 in 3520 tissue specimens by fluorescence in situ hybridization to tissue microarrays. Am. J. Pathol. 2002; 161; 73–79.
59. Hsu F, Nielsen TO, Al-Kushi A et al. Multiple tumor microarrays are an effective quality assurance tool for diagnostic immunohistochemistry laboratories. Mod. Pathol. 2002; 15; 1374–1380.
60. Traweek ST, Kandalaft PL, Mehta P, Battifora H. The human hematopoietic progenitor cell antigen (CD34) in vascular neoplasia. Am. J. Clin. Pathol. 1991; 96; 25–31.
61. van de Rijn M, Lombard CM, Rouse RV. Expression of CD34 by solitary fibrous tumors of the pleura, mediastinum and lung. Am. J. Surg. Pathol. 1994; 18; 814–820.
62. Kindblom LG, Remotti HE, Aldenborg F, Meis-Kindblom JM. Gastrointestinal pacemaker cell tumor (GIPACT): gastrointestinal stromal tumors show phenotypic characteristics of the interstitial cells of Cajal. Am. J. Pathol. 1998; 152; 1259–1269.
63. Miettinen M, Fanburg-Smith JC, Virolainen M, Shmookler BM, Fetsch JF. Epithelioid sarcoma: an immunohistochemical analysis of 112 classical and variant cases and a discussion of the differential diagnosis. Hum. Pathol. 1999; 30; 934–942.
64. Natkunam Y, Rouse RV, Zhu S, Fisher C, van de Rijn M. Immunoblot analysis of CD34 expression in histologically diverse neoplasms. Am. J. Pathol. 2000; 156; 21–27.
65. Packeisen J, Buerger H, Krech R, Boecker W. Tissue microarrays: a new approach for quality control in immunohistochemistry. J. Clin. Pathol. 2002; 55; 613–615.
66. Parker R, Cupples J, Lesack D, Grant D, Huntsman D, Gilks CB. Assessment of interlaboratory variation in the immunohistochemical determination of estrogen receptor status using a breast cancer tissue microarray. Am. J. Clin. Pathol. 2002; 117; 723–728.
67. von Wasielewski R, Mengel M, Wiese B, Rudiger T, Muller-Hermelink HK, Kreipe H. Tissue array technology for testing interlaboratory and interobserver reproducibility of immunohistochemical estrogen receptor analysis in a large multicenter trial. Am. J. Clin. Pathol. 2002; 118; 675–682.
68. Torhorst J, Bucher C, Kononen J et al. Tissue microarrays for rapid linking of molecular changes to clinical endpoints. Am. J. Pathol. 2001; 159; 2249–2256.
69. West RB, Harvell J, Linn SC et al. APOD in soft tissue tumours: a novel marker for dermatofibrosarcoma protuberans. Am. J. Surg. Pathol.; in press.
2004 Blackwell Publishing Ltd, Histopathology, 44, 97–108.