Journal of Internal Medicine 2000; 248: 271–276
INTERNAL MEDICINE IN THE 21ST CENTURY
Consulting the source code: prospects for gene-based medical diagnostics
U. LANDEGREN
From the Rudbeck Laboratory, Department of Genetics and Pathology, Uppsala, Sweden
Abstract. Landegren U (Rudbeck Laboratory, Uppsala, Sweden) Gene-based diagnostics (Internal Medicine in the 21st Century). J Intern Med 2000; 248: 271–276.
Gene-based diagnostics has been slow to enter routine medical practice in a grand way, but it is now spurred on by three important developments. First, the total genetic information content of humans and most of our pathogens is rapidly becoming available. Second, a very large number of genetic factors of diagnostic value in disease are being identified; such factors include the identity of genes frequently targeted by mutations in specific diseases, common DNA sequence variants associated with disease or responses to therapy, and copy number alterations at the level of DNA or RNA that are characteristic of specific diseases. Finally, improved methodology for genetic analysis now brings all of these genetic factors within reach in clinical practice. The increasing opportunities for genetic diagnostics may gradually influence views on health and normality, and on the genetic plasticity of human beings, provoking discussions about some of the central attributes of genetics.
Keywords: DNA amplification, gene diagnostics, gene ethics, genetic prediction, mutation detection.
© 2000 Blackwell Science Ltd. Reprinted from Journal of Internal Medicine 248.
Introduction
Our cells may be profitably viewed as computers that run on software written in the quaternary code of the As, Cs, Gs and Ts of DNA, and process input from within the cell and its environment. The consequence of this computation is to program cellular functions, and the extension of human life from one cell generation to another and from parent to child. Fifty years after the first glimpses of how genetic information is stored, applied and transmitted by the cell, we are about to gain complete access to our own source code as the total genetic information content of humans is being downloaded to other, more easily accessible data storage media. DNA diagnostics, initially the domain of clinical geneticists, is now an indispensable tool in an increasing number of clinical specialities, including oncology, paediatrics, transplantation medicine, infectious diseases and forensics. The US National Cancer Institute has launched the Cancer Genome Anatomy Project with the transition to molecular genetic diagnostic criteria as one important objective, and molecular genetic disease classifications will take on a central role in internal medicine as genetic disease mechanisms are identified. Ongoing medical research reveals highly informative target sequences for gene diagnostic analyses, and the unit cost for bits of genetic information will decrease as the analytic capacity rapidly increases by orders of magnitude. Still, this reduction in cost per test may be more than balanced by the increased use of genetic diagnostics in clinical medicine. In the long term it may become a matter of routine to obtain a complete genetic record for every child at birth, to guide choice of lifestyle or to initiate medical precautions [1]. Extensive opportunities for gene diagnostics will also be realized with the development of mature technologies to modify or add genes in patients, in order to correct errors or program entirely novel genetic properties. Here, however, genetics will be discussed in the near-future perspective of a read-only memory for clinically relevant genetic factors.
Genomic progress
The Genome Project
The advent of molecular biology brought to biology a strong mechanistic, hypothesis-driven perspective. However, genomic research, initiated some 10 years ago, represents a return to a descriptive, Linnaean mode of accumulating information. As of this writing 97% of the human genome sequence has been recorded in the course of large sequencing programs. Over the next few years an increasingly precise human genome sequence will be available via the Web, including the coding sequence of all human genes, complete with regulatory regions directing their expression. In parallel with the DNA sequencing of one or a few human genomes, half a million of all those places where human genomes commonly differ from one another will have been identified and quickly published on the Internet by early 2001 [2]. Also, the genomes of pathogenic bacteria are being characterized to the last nucleotide, and this is true also for an increasing number of eukaryotic parasites.
Functional genomics
The focus of research interests is now shifting to the accumulation of global information about how genomes encode the properties of proteins, cells and organisms, referred to as functional genomics [3]. Over the next 10-year period these efforts should provide invaluable information about the structure of most or all human proteins, their interaction partners, and the connectivity of all metabolic and signal transmission chains, thereby defining the modules that perform the various cellular functions [4]. In the process, analytical tools will be developed for efficient studies of all these properties; tools that will gradually become available also for diagnostic applications.
The complete genome sequences of man and his parasites will surely prove instrumental for elucidation of disease mechanisms. As more and more disease states are explained by applying the tools of genetics to patient samples, one of the early medical beneficiaries will be the field of gene diagnostics.
The search for molecular lesions
Early in the twentieth century Garrod observed that alkaptonuria is inherited in a Mendelian fashion, indicating that discrete genetic factors can underlie disease [5]. At the middle of the century, Linus Pauling and coworkers demonstrated the molecular nature of another defect, the beta globin aberration in sickle cell anaemia [6]. Then, in the late 1970s the corresponding DNA sequence alteration became the target for a primordial molecular genetic diagnostic test by Kan and Dozy [7]. Currently, medical research yields an abundance of genetic factors of value as targets for diagnostic investigations. The various types of analyses commonly used to investigate such factors may be grouped as follows.
Resequencing
Single genes or small sets of genes are being identified that are frequently found to be mutated in particular diseases, and that may therefore need to be scrutinized in individual patients for the cellular equivalents of software bugs. This requires resequencing the relevant gene in individual patients to reveal any deviations from one or a few known normal variants of the sequence, followed by the frequently difficult evaluation of what role such changes might have in the disease.
Genotyping
Certain human DNA sequence variants are associated with particular diseases, or with responses to therapy, or are otherwise interesting, e.g. as markers in forensic analyses. Genotyping of such markers may therefore be of value in predicting or diagnosing diseases, and in the near future it may provide a basis for personalized drug prescriptions by guiding the selection of drugs or dosage. In oncology, unique genetic lesions may be demonstrated in malignancies of individual patients and used to guide
construction of patient-specific genotyping reagents, to apply the exquisite specificity and sensitivity of DNA tests in order to investigate spread of disease or to search for any cells remaining after treatment.
Transcript profiling
Measurements of the level of expression of large sets of gene transcripts provide a view of the gene programs currently running in the cells of a sampled tissue, and thus offer a valuable view of both the state of differentiation and the activity of a tissue sample. The analysis provides a useful basis for diagnosis in, e.g., malignancy [8], and allows monitoring of the progress of disease or response to therapy. After a phase of analyses of transcription of thousands or tens of thousands of genes, more limited sets of genes will probably be defined that are highly informative for disease diagnosis and follow-up. Expression measurements are likely to be easily accepted in clinical diagnostics as they do not confer the stigma that may be attached to congenital genetic lesions (see further below). Moreover, from a commercial standpoint tests that can be usefully repeated may have advantages over genotyping specific DNA sequence variants, something that will typically be performed only once per patient.
Gene quantification
Copy number changes of specific gene regions in malignancy can reflect loss of genetic factors that protect against excessive proliferation or, conversely, gain of copies of genes that can support such proliferation, evident as increased copy numbers of certain chromosomal regions. Regions of the genome present in increased or decreased copy numbers can be demonstrated by comparing genomic DNA samples from normal and malignant tissues through a process called comparative genomic hybridization [9].
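The comparison of normal and malignant DNA described above can be illustrated with a small calculation. The following is a minimal sketch, not a description of any particular assay: it assumes that each genomic region yields one hybridization signal from the tumour sample and one from the normal sample, expresses their ratio on a log2 scale, and labels the region as gained or lost. The function names and the cut-off values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical cut-offs on the log2 scale (assumption for illustration only;
# a real assay would calibrate such thresholds empirically).
GAIN_THRESHOLD = 0.3
LOSS_THRESHOLD = -0.3


def log2_ratio(tumour_signal, normal_signal):
    """Return log2(tumour / normal) for the signals of one genomic region."""
    if tumour_signal <= 0 or normal_signal <= 0:
        raise ValueError("signals must be positive")
    return math.log2(tumour_signal / normal_signal)


def call_copy_change(ratio):
    """Label a log2 ratio as 'gain', 'loss' or 'no change'."""
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    return "no change"


def classify_regions(tumour, normal):
    """Compare two dicts mapping region name -> signal, region by region."""
    return {
        region: call_copy_change(log2_ratio(tumour[region], normal[region]))
        for region in tumour
    }
```

On this scale, a region present in three copies in the tumour against two in normal tissue gives log2(3/2), roughly 0.58, while loss of one of two copies gives log2(1/2) = -1. Equal signals give 0, which is why a symmetric pair of thresholds around zero is a natural, if simplistic, choice.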
And much more
Other classes of molecular genetic analyses are known to be informative about predisposition for, or state and progress of, disease. For example, gross structural changes of chromosomes can be of diagnostic value in congenital disease and in malignancy. Here, cytogenetic analyses are far more informative when combined with in situ hybridization using specific DNA probes. Also, in histopathology, in situ analysis of the distribution of RNA and DNA sequences provides more specific information. Imprinting of genomic DNA, reflected as altered patterns of methylation, can play a role in determining the activity of nearby genes and can influence malignant transformation in certain tumours, providing diagnostic value [10]. The ability to expand DNA sequence elements at the ends of chromosomes, telomeres, is characteristic of a limited number of stem cells but also of many malignant cells, and this activity can therefore betray the presence of malignant cells in tissue samples [11]. Any and all identified genetic effects associated with disease, whether causative or secondary, represent potential targets for diagnostics.
Modalities and methods for genetic diagnostics
From homebrews to big pharma
Current clinical gene-based diagnostic analyses are frequently of a type referred to in the industry as homebrew tests, developed and adapted by the individuals offering this diagnostic service. This state of affairs must change as both the scale of analyses and the accountability of the diagnostic labs increase. One prominent driving force for the development of improved technology for gene diagnostics is the requirement of pharmaceutical and genomics companies for very large-scale genetic analyses. The purpose for the industry is to identify promising new target molecules for pharmaceutical intervention, to evaluate new candidate drugs or to stratify patients according to genetically based differences in therapeutic responses. At a somewhat slower rate, such improved tests also become available for medical routine analyses.
Centralized labs and bathroom cabinets
Future genetic tests will range from highly specialized analyses, centralized to a limited number of labs, to doctor's-office tests that will serve to guide drug prescription, diagnose infectious agents or measure disease activity, preferably whilst the patient remains in the consultation room. Tests will also become available for routine check-ups, for
instance by searching for genetic changes in cells exfoliated from lung or bladder epithelia, permitting early diagnosis of malignancies [12]. Simplified tests may also be formatted to allow patients to diagnose specific infections themselves, using simple DNA diagnostic kits for use at home.
Diagnostic technologies
Routine diagnostics of human gene sequences first became possible with the advent of the Southern blot in the mid-1970s [13]. Opportunities further improved dramatically 10 years later with the polymerase chain reaction (PCR), which offers the exquisite specificity and sensitivity required to conveniently identify unique DNA sequences [14]. The specificity of the procedure derives from the requirement for coordinated target recognition by two short DNA probes, and this sets off an exponential amplification, resulting in the highly sensitive target detection. Since the mid-1990s an increasing number of genetic analyses are performed using microprocessor-like DNA microarrays [15]. Such arrays feature large sets of DNA hybridization probes distributed in a microscopic pattern on planar glass or silica surfaces. A DNA or RNA sample from a patient added to such arrays can be simultaneously interrogated with respect to large numbers of target sequences. Genetic analyses that have been adapted to the microarray format include resequencing, genotyping and measurements of gene expression and of copy number of genetic regions through comparative genomic hybridization.
Recent years have seen a flurry of gene diagnostic technological development improving various steps of the analyses [16]: The Invader technique [17] and rolling circle amplification [18] offer new means to obtain amplified detection signals. A number of homogeneous visualization techniques, mostly based on fluorescence resonance energy transfer or fluorescence polarization spectroscopy, simplify analyses by avoiding the requirement for post-PCR processing.
Mass spectrometry and the luciferase reaction, finally, furnish efficient and precise means to record the results of genetic analyses.
Future directions
One of the important challenges in gene diagnostics will be to devise probing strategies that combine the specificity and sensitivity of PCR with the ability of DNA hybridization microarrays to simultaneously investigate large numbers of target sequences. This combination will support downloading information from our cells about thousands of genetic variants, or the expression of thousands of genes. In this regard, my own research group attaches great hopes to the so-called padlock probes (Fig. 1) that seem particularly suited for highly specific, parallel analyses [19]. In the presence of the appropriate target molecule the two target-complementary ends of these linear DNA probes can be connected to form circular DNA strands, locked around their target sequences (Fig. 1).
Future genetic diagnostics is likely to indiscriminately access the required information not just at the level of DNA or RNA, but also at levels of protein or protein function. As all proteins encoded in the human genome are being defined, a logical next step will be to develop specific affinity reagents for most or all proteins. Entirely new analytic concepts will be required in order to analyse large numbers of proteins that may occur at widely different concentrations. Here, some of the experiences from DNA diagnostic technology will prove instructive. My group is currently developing a class of reagents, termed proximity probes, that give rise to a detectable signal only if pairs of affinity probes have detected a target protein in a coordinated reaction, the coincident binding giving rise to an amplifiable detection signal (Fredriksson et al., in preparation).
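The dual-recognition principle of padlock probes can be sketched in a few lines of code. This is a deliberately simplified model under stated assumptions, not the published probe design procedure: sequences are plain strings written 5'→3', base pairing is all-or-none (a single mismatch anywhere in either arm prevents joining, whereas a real ligase is mainly sensitive to mismatches near the junction), and the linker segment between the two arms is ignored. All function names are hypothetical.

```python
# Watson-Crick complement table for the four DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_padlock_arms(target_site, split):
    """Design the two probe arms for a target site (hypothetical helper).

    The site is cut at position `split`. The probe's 5' arm pairs with the
    upstream half and the 3' arm with the downstream half, so that the two
    probe ends meet head to tail at the cut. Returns
    (five_prime_arm, three_prime_arm), both written 5'->3'.
    """
    upstream, downstream = target_site[:split], target_site[split:]
    return reverse_complement(upstream), reverse_complement(downstream)


def can_circularize(five_prime_arm, three_prime_arm, sample):
    """Return True if both arms pair perfectly at adjacent positions in `sample`.

    In the circularized probe the 3' arm is followed directly by the 5' arm,
    so the joined arms must be the reverse complement of a stretch of the
    sample strand. This simplified model demands a perfect match throughout.
    """
    junction = three_prime_arm + five_prime_arm
    return reverse_complement(junction) in sample
```

In this model a probe designed for the site `ACGTTGCAAGGCTA` (cut after the seventh base) closes on a sample containing that exact site but not on a sample in which the base next to the junction has changed. Requiring both arms to pair side by side is what gives the approach its specificity, since either arm alone could find partial matches elsewhere in a genome.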
Fig. 1 A so-called padlock probe is a linear DNA strand (shown in yellow) with sequences at the two ends that allow the probe to base-pair to a target sequence (shown in blue). If the sequences have been selected so that the two probe ends are brought next to each other upon base pairing, then the 5′ phosphate group (shown in red) at one end of the molecule can be covalently connected to the 3′ hydroxyl at the other end of the probe by the enzyme DNA ligase. Thereby the probe is converted to a circular molecule, topologically linked to the target sequence.
Something about the ethics of gene diagnostics
In anticipation of a greatly increased use of DNA-based diagnostics, it is useful to consider some peculiar properties of genetics.
Discrimination
Genetic analysis serves to identify differences between species and individuals within a species, with the purpose of correlating this to functionally or clinically interesting differences. As it applies to man, this activity tends to provoke at least two distinct types of conflict: there is concern that knowledge of the genetic basis of differences in propensity to disease, athletic proficiency or behavioural properties will emphasize differences between individuals and perhaps also between populations. Increased genetic analysis could become associated with exaggerated interpretations or exploitation of any such differences, and with genetic exclusion from the workplace or from health insurance. On the other hand, the considerable economic value attributed to human genetic variation in the search for disease-associated genetic variants, e.g. by the pharmaceutical industry, raises issues about the rights to commercial exploitation of such findings. This question has global political dimensions.
Normative
An issue of normality is central to the way we use genetic terminology and practise gene diagnostics. The act of offering prenatal diagnostics for a particular condition will be perceived as a statement by the health care provider that individuals with this diagnosis lead lives that are in some sense inferior to those of healthy individuals. On the other hand, it is difficult to delegate all responsibility for what tests
are performed to the parents (mothers) for both economic and ethical reasons.
Predictive
The predictive nature of genetic tests for monogenic conditions that sometimes will only manifest long after the diagnosis is made represents a considerable problem in medicine. By contrast, the predictive value of genetic risk factors for most common diseases is likely to prove substantially weaker and more complex. Nonetheless, it is probable that large numbers of factors will be identified with considerable predictive value for the development of disease, and also for other properties such as personality traits, sexual orientation, etc. A particular problem lies in the fact that we may not always be able to tell just what a test will predict. ApoE tests, initially offered to evaluate risks of diseases of lipid metabolism, were later shown to predict the level of risk for Alzheimer's disease, thrusting this information on individuals who never agreed to be tested for risks of a neurodegenerative disease.
In the longer term an increased use of genetic tests may fundamentally influence views on predetermination and free will.
Manipulative
Finally, the potential of modern genetics not only to record genetic information but also to write new information into our genomes will have consequences for how we view man. In a longer time frame we can expect dramatically improved means of adding genes that are missing or deficient in an individual, or in some cells in the individual. If desired, genes that are not part of the normal complement can also be encoded; even whole new chromosomes might be created to introduce suites of improved or entirely novel genes, complete with any regulatory instructions.
A requirement for DNA literacy
With medicine and medical diagnostics increasingly based on the code that programs our cells, there will be a great need for competent interpretation by genetic counsellors. However, all medical practitioners must learn to select, evaluate and act upon results of a wide variety of genetic tests that now become available. In fact, proper use of the far-reaching gene diagnostic opportunities will also require an increased awareness among the general public of the information available in the source code of man, and of the choices that gene diagnostics places before us.
Acknowledgements
Dr Mats Nilsson provided helpful criticism of the manuscript and Tomas Hansson prepared the illustration. Work in my group is supported by grants from the Beijer Foundation, from the Swedish Medical and Technological Research Councils, and from the Swedish Cancer Fund.
References
1 Sander C. Genomic medicine and the future of health care. Science 2000; 287: 1977–8.
2 Kwok PY, Gu Z. Single nucleotide polymorphism libraries: why and how are we building them? Mol Med Today 1999; 5: 538–43.
3 Vukmirovic OG, Tilghman SM. Exploring genome space. Nature 2000; 405: 820–2.
4 Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature 1999; 402 (Suppl.): C47–C52.
5 Garrod AE. The incidence of alkaptonuria: a study in chemical individuality. Lancet 1902; ii: 1616–20.
6 Pauling L, Itano HA, Singer SJ, Wells IC. Sickle cell anemia: a molecular disease. Science 1949; 110: 543.
7 Kan YW, Dozy AM. Polymorphism of DNA sequence adjacent to human beta-globin structural gene: relationship to sickle mutation. Proc Natl Acad Sci USA 1978; 75: 5631–5.
8 Alizadeh AA, Eisen MB, Davis RE et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403: 503–11.
9 Forozan F, Karhu R, Kononen J et al. Genome screening by comparative genomic hybridization. Trends Genet 1997; 13: 405–9.
10 Ogawa O, Eccles MR, Szeto J et al. Relaxation of insulin-like growth factor II gene imprinting implicated in Wilms' tumour. Nature 1993; 362: 749–51.
11 Buys CH. Telomeres, telomerase, and cancer. New Engl J Med 2000; 342: 1282–3.
12 Fliss MS, Usadel H, Caballero OL et al. Facile detection of mitochondrial DNA mutations in tumors and bodily fluids. Science 2000; 287: 2017–9.
13 Southern EM. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol 1975; 98: 503–17.
14 Saiki RK, Scharf SJ, Faloona FA et al. Enzymatic amplification of beta-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science 1985; 230: 1350–4.
15 Lander ES. Array of hope. Nat Genet 1999; 21 (Suppl.): 3–4.
16 Landegren U, Nilsson M, Kwok P-Y. Reading bits of genetic information: methods for single-nucleotide polymorphism analysis. Genome Res 1998; 8: 769–76.
17 Lyamichev V, Mast AL, Hall JG et al. Polymorphism identification and quantitative detection of genomic DNA by invasive cleavage of oligonucleotide probes. Nat Biotechnol 1999; 17: 292–6.
18 Lizardi PM, Huang X, Zhu Z et al. Mutation detection and single-molecule counting using isothermal rolling-circle amplification. Nat Genet 1998; 19: 225–32.
19 Nilsson M, Malmgren H, Samiotaki M et al. Padlock probes: circularizing oligonucleotides for localized DNA detection. Science 1994; 265: 2085–8.
Received 10 August 2000; accepted 14 August 2000.
Correspondence: Dr Ulf Landegren, Rudbeck Laboratory, Uppsala University, SE-75185 Uppsala, Sweden (fax: +46 18 471 4910; e-mail: [email protected]).