REVIEWS
Precision Medicine in the Age of Big Data: The Present and Future Role of Large-Scale Unbiased Sequencing in Drug Discovery and Development

P Vicini1, O Fields1, E Lai2, ED Litwack3, A-M Martin4, TM Morgan5, MA Pacanowski3, M Papaluca6, OD Perez1, MS Ringel7, M Robson5, H Sakul1, J Vockley8, T Zaks9, M Dolsten1 and M Søgaard1

1Pfizer Worldwide Research & Development, La Jolla, California, Collegeville, Pennsylvania, and New York, New York, USA; 2Takeda Pharmaceuticals International, Deerfield, Illinois, USA; 3Food and Drug Administration, Silver Spring, Maryland, USA; 4GlaxoSmithKline, Collegeville, Pennsylvania, USA; 5Novartis Institutes for Biomedical Research, Cambridge, Massachusetts, and East Hanover, New Jersey, USA; 6European Medicines Agency, London, UK; 7Boston Consulting Group, Boston, Massachusetts, USA; 8Inova Translational Medicine Institute, Falls Church, Virginia, USA; 9Sanofi, Cambridge, Massachusetts, USA. Correspondence: M Søgaard ([email protected])

Received 14 October 2015; accepted 30 October 2015; advance online publication 4 November 2015. doi:10.1002/cpt.293

High-throughput molecular and functional profiling of patients is a key driver of precision medicine. DNA and RNA characterization has been enabled at unprecedented cost and scale through rapid, disruptive progress in sequencing technology, but challenges persist in data management and interpretation. We analyze the state of the art of large-scale unbiased sequencing (LUS) in drug discovery and development, including technology, applications, and ethical, regulatory, policy, and commercial considerations, and discuss issues of LUS implementation in clinical and regulatory practice.

"Predictive, preventive, personalized, and participatory"1 medicine, particularly patient genotype- and phenotype-directed approaches, holds great promise for improving human health and optimizing the development and clinical use of medications. Increasingly, biomedical research is focused on the interplay of in-depth phenotyping and genetic and molecular profiling (including omics) with outcomes. Precision medicine, that is, the use of comprehensive genomic, transcriptomic, and proteomic (or even "pan-omic") characterization of patients to guide medical decisions, is a key step toward truly personalized medicine.2 Precision medicine has been envisaged3 as the combination of biomarker and molecular information with a specific clinical phenotype at the individual patient level, and should provide strong synergy with complementary approaches such as therapeutic drug monitoring4 and routine clinical risk assessments (e.g., in cardiovascular disease). From a pharmacotherapeutic perspective, one of the goals of precision medicine is to improve the therapeutic index5 of drug products in a given patient population by increasing the probability of efficacy, decreasing the probability of serious adverse events, or, more desirably, achieving both. Biomarker-directed approaches to drug and/or dose selection, specifically through genomic testing, are at the foundation of modern precision medicine, as exemplified by the proliferation of targeted therapies,
especially in oncology.6 Compelling examples of predictive genetic information are starting to appear outside of oncology7 and provide the motivation for applications of precision medicine in other disease areas. According to the Personalized Medicine Coalition, therapeutic and diagnostic examples of personalized medicine grew from 13 in 2006 to 113 in 2014. Most of the advances noted above were made on the basis of conventional approaches to genetics and physiology, not necessarily cutting-edge large-scale molecular genomic studies.2 In these examples, patients are typically selected with companion diagnostics supporting a "one-biomarker, one-drug" approach. As our understanding of the interactions of multiple biomarkers with the clinical phenotype progresses, the next wave of diagnostics will likely include biomarker panels and large-scale sequencing approaches. The sheer volume of patient data generation is rapidly outgrowing the scientific and computational resources needed for accurate interpretation and subsequent reporting of test results to support clinical decision-making. As such, the entire current diagnostic instrumentation and processing infrastructure may need to change so that its workflow better supports "big data" testing paradigms. Several factors increase the urgency for precision medicine strategies and implementation in drug development and clinical practice.8
Figure 1 Graphical summary of the temporal evolution of genetic sequencing technology, including the increase in throughput and the decrease in cost.
These include advocacy from precision medicine stakeholders, including patients, physicians, regulators, and payers, based on the desire to increase treatment response rates and minimize therapeutic risks; a growing appreciation of disease complexity and heterogeneity; decelerating pharmaceutical research and development productivity; and financial constraints that require demonstration of added value in healthcare.9 Despite the advances in technology development (Figure 1) and the emerging clinical applications for precision medicine, significant practical obstacles remain as barriers to progress and true change. The contributing forces are both cultural and technological: medical practice and translational science do not easily keep pace with rapid technological advances. In this article we review and address some of the important challenges for precision medicine in the era of "big data," including practical and ethical aspects.

ROLE OF DIAGNOSTICS IN DRUG DEVELOPMENT AND HEALTHCARE DELIVERY
To understand the challenges related to large-scale "omics"-based personalized medicine, it is helpful to review the state of diagnostics, especially companion diagnostics. Companion diagnostic tests provide information that is essential for the safe and effective use of a corresponding therapeutic product. In some cases, they are co-developed and co-marketed with the therapeutic agent.10 Various well-established technologies can be used for companion diagnostics, such as immunohistochemistry, immunoassays, and molecular diagnostic assays (e.g., polymerase chain reaction [PCR]). The added value of diagnostics closely integrated with targeted therapeutics is, at least in principle, three-fold: for patients, there is an increased likelihood that the treatment will be safe and effective; for diagnostic companies, the benefit lies in
the ability to gain market access for their patient selection tools; for pharmaceutical companies, the advantage is in creating drugs that can be targeted to patients who will respond, thus indicating a clearer path to an optimized benefit/risk profile, and therefore to approval and reimbursement. Industry activity aimed at developing and deploying companion diagnostics to facilitate precision approaches to therapy has grown substantially in the last decade, particularly within oncology. In theory, biomarker-directed approaches, such as those included in the Lung Cancer Master Protocol (http://lung-map.org/), should accelerate drug development by enabling enrichment of responder populations in clinical trials (i.e., making for smaller and shorter trials) and by improving the benefit/risk ratio of the therapeutic. For example, while crizotinib became available to patients 15 years11 after basic science research on its target began, only about 6 years of clinical development were needed to reach the market. Many other programs leveraged the ability to target specific subsets of patients to run smaller and faster trials, including vemurafenib and dabrafenib for the BRAF V600 mutation(s) in melanoma,12 ivacaftor for G551D and other CFTR mutations in cystic fibrosis,7 and afatinib and erlotinib for EGFR mutations in lung cancer.13 Beyond co-development and approval of a specific drug-diagnostic pair, genomic markers for treatment response have been identified for numerous drugs, many of which have been in use for decades. There are examples of clinical uptake of genetic testing outside of oncology (e.g., HLA-B*5701 for abacavir), and several programs aimed at clinical implementation of genomic testing have been established in the past few years (see, for example, http://www.ignite-genomics.org/).
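The trial-enrichment argument above can be made concrete with a back-of-the-envelope power calculation. The sketch below is illustrative only: the response rates are hypothetical, and it uses the standard normal-approximation sample-size formula for comparing two proportions, not a calculation drawn from any of the programs cited.

```python
# Sketch: why biomarker enrichment shrinks trials. Uses the standard
# normal-approximation sample size per arm for comparing two response
# proportions. All response rates are hypothetical illustrations.
from statistics import NormalDist

def n_per_arm(p_treat: float, p_ctrl: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients per arm to detect p_treat vs. p_ctrl."""
    z = NormalDist()
    z_a, z_b = z.inv_cdf(1 - alpha / 2), z.inv_cdf(power)
    variance = p_treat * (1 - p_treat) + p_ctrl * (1 - p_ctrl)
    return int((z_a + z_b) ** 2 * variance / (p_treat - p_ctrl) ** 2) + 1

# All-comers: if only 20% of patients carry the sensitising marker and
# those patients respond at 50%, the overall treated response rate is
# diluted to 0.2 * 0.5 + 0.8 * 0.2 = 0.26 against a 20% control rate.
print(n_per_arm(0.26, 0.20))  # ~769 patients per arm
# Marker-enriched: every enrolled patient can respond to the mechanism.
print(n_per_arm(0.50, 0.20))  # ~36 patients per arm
```

Under these invented rates, enrichment cuts the trial by roughly a factor of twenty, which is the quantitative intuition behind "smaller and shorter trials."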
One study among many designed to bridge this gap, the 1200 Patients Project at the University of Chicago,14 uses preemptive, comprehensive pharmacogenetic genotyping to integrate genetics into medical practice. Vanderbilt University Medical Center's PREDICT (Pharmacogenomic Resource for Enhanced Decisions in Care & Treatment) project prospectively genotypes patients for 184 common polymorphisms within 34 genes associated with drug absorption, distribution, metabolism, and excretion (ADME) and uses the genotypes for point-of-care clinical decisions about the prescribing of drugs including warfarin, clopidogrel, thiopurines, and tacrolimus.15 Approval of a therapeutic product for a limited subset of patients, or identification of a pharmacogenetic biomarker alone, is not sufficient to ensure testing, or even use of the drug product, in clinical practice. In the clinical context, the utility of a genomic test encompasses not only whether it significantly improves patient outcomes (http://www.cdc.gov/genomics/gtesting/ACCE/), but also the overall cost-effectiveness of treatment, which depends on the biomarker's characteristics (including prevalence in the population, predictive value, penetrance, and clinical specificity and sensitivity), on cost, and on the availability of validated platforms, expertise, and acceptable turnaround times for clinical interpretation. With so many tests now emerging, many for the same biomarker, a shift in the diagnostic landscape may be necessary to ensure quality and efficiency. Beyond targeted sequencing, broad deployment of large-scale unbiased methods might provide the data to address some of these challenges, thus simplifying the landscape of companion diagnostics.
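The role of biomarker prevalence in clinical utility can be illustrated with a short Bayes' rule calculation; the sensitivity, specificity, and prevalence figures below are hypothetical and serve only to show why positive predictive value collapses for rare markers even when assay performance looks excellent.

```python
# Worked example: positive predictive value (PPV) as a function of
# biomarker prevalence, for a fixed (and invented) assay sensitivity
# and specificity of 95% each.

def ppv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Bayes' rule: P(marker truly present | test positive)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

for prev in (0.20, 0.01):
    print(f"prevalence {prev:.0%}: PPV = {ppv(prev, 0.95, 0.95):.2f}")
# prevalence 20%: PPV = 0.83
# prevalence 1%:  PPV = 0.16
```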
OVERVIEW OF LARGE-SCALE UNBIASED SEQUENCING
The high-throughput methods necessary to execute Large-scale Unbiased Sequencing (LUS) comprise the family of technologies known as Massively Parallel Sequencing or Next-Generation Sequencing (NGS); we will use these terms interchangeably in the text. NGS encompasses several related but distinct technologies that differ in the extent and coverage of the genomic content assessed. For the purposes of our discussion, LUS includes the following techniques: Whole Genome Sequencing (WGS), which covers the totality of the genome, both coding and noncoding DNA; Whole Exome Sequencing (WES), which is limited to sequencing protein-coding DNA (about 1.5% of the whole genome); and the body of techniques comprising Whole Transcriptome Analysis, focusing on RNA transcripts (also known as RNA-Seq). Unlike LUS, targeted sequencing methods, which are the basis of most clinical diagnostics in use today, focus on a defined set of genes or region of the genome. The target region is sequenced at very high depth with complete coverage, to obtain high sensitivity for mutation detection, high confidence that observed rare mutations are valid, and the ability to detect DNA sequence reads that exist only in low abundance (e.g., mosaic mutations present in only a subset of the cells contributing DNA to a sample).16 Within the field of oncology, targeted sequencing can accommodate panels of genes or mutations customized to specific diseases in which a small number of mutations are seen in a significant proportion of patients (e.g., EGFR, ALK, cMET, KRAS, BRAF, and ROS1 in lung cancer). NGS panels that comprise a set of actionable genes have had significant impact in oncology, and have attracted collaborations with several large pharmaceutical companies for development into companion diagnostics. NGS technologies provide an additional set of novel opportunities and challenges for diagnostics and companion diagnostics. LUS approaches, especially WGS and WES, are beginning to show promise as clinical diagnostic tools (e.g., at academic medical centers). As technology continues to improve, it is conceivable that the next few years will witness a decline in per-genome sequencing costs large enough to enable widespread adoption of these technologies in the clinic and in clinical trials. In principle, LUS could be a powerful alternative to the conventional practice of developing a single-marker diagnostic when required by a late-stage drug development project. The development of larger, comprehensive diagnostic panels, with potential utility across pharmaceutical agents and potentially across disease indications, could transform this landscape.
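The depth-versus-breadth trade-off between LUS and targeted panels can be sketched with the classical Lander-Waterman coverage model; the read counts and target sizes below are illustrative assumptions, not specifications of any particular platform, and real runs are usually multiplexed across many samples.

```python
# Sketch: mean depth c = N*L/G for a fixed sequencing run spread over
# targets of very different sizes, and the Poisson approximation that a
# given base is left uncovered with probability exp(-c).
import math

def mean_depth(n_reads: float, read_len: float, target_bp: float) -> float:
    return n_reads * read_len / target_bp

def frac_uncovered(depth: float) -> float:
    return math.exp(-depth)  # Poisson P(zero reads at a given base)

targets = {
    "whole genome (WGS)": 3.2e9,   # coding + noncoding DNA
    "whole exome (WES)":  4.8e7,   # roughly 1.5% of the genome
    "50-gene panel":      2.5e5,   # targeted clinical panel
}
n_reads, read_len = 1e9, 100       # one hypothetical run, 100-bp reads
for name, size in targets.items():
    c = mean_depth(n_reads, read_len, size)
    print(f"{name}: ~{c:,.0f}x mean depth, "
          f"{frac_uncovered(c):.1e} of bases expected uncovered")
```

The same run that yields roughly 30x across a whole genome yields thousands-fold depth on an exome and hundreds-of-thousands-fold depth on a small panel, which is why targeted methods achieve the sensitivity for rare and mosaic mutations described above.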
CURRENT RESEARCH AND CLINICAL APPLICATIONS OF LUS
LUS approaches have been used extensively for discovery and for insight into disease mechanisms. From the research perspective, the availability of WGS or WES data during drug discovery and development can enable selection of novel targets and detection of new safety and efficacy biomarkers, and can confirm mechanisms of action, mechanisms of resistance, and disease pathogenesis. International projects, such as The Cancer Genome Atlas,17 the 1000 Genomes Project,18 and the 100,000 Genomes Project (100kGP),19 aim to capture gene-disease relationships using large samples and a variety of platforms and sequencing approaches, to provide essential information about associations of genetic variation with diseases. DNA sequencing to identify de novo mutations of large effect, as in the case of autism, or rare protective variants, such as those in PCSK9 that regulate low-density lipoprotein (LDL) cholesterol levels, may provide important insights about disease and potential drug targets. In the case of PCSK9, the discovery of mutations in this gene formed the basis for the recent development of drugs that target this protein to treat dyslipidemias. Recent reports provide tantalizing advances for determining the genetic basis of challenging diseases.20 The sequencing and phenotyping of individuals from consanguineous and genetically bottlenecked populations will likely lead to identification of human "knockouts" for most nonessential human genes.21 In other diseases, such as schizophrenia, genome-wide association studies (GWAS) have recently identified multiple genes of small effect size; disease risk is influenced by numerous variants, and individual variants are difficult to associate with non-Mendelian disease or clinical outcomes. Information about environmental and other kinds of influences is partly captured in the transcriptome, which can be explored through RNA-Seq, part of the LUS family of technologies. RNA-Seq signatures are beginning to be studied (e.g., in some types of breast cancer) and could offer earlier opportunities for combining genomic and transcriptomic22 data. RNA-Seq analysis of discrete subsets of peripheral
blood mononuclear cells in autoimmune disease patients may be a particularly promising area of study. On the clinical practice front, WES in oncology has enabled detection of somatic mutations and guided clinical trial enrollment in at least one case.23 Researchers at Washington University have used WGS to diagnose a challenging leukemic disorder, showing that these data can be gathered and analyzed within a time frame compatible with clinical decision-making.24 Clinical applications of LUS take place outside of oncology as well, with rare diseases, neuroscience, and infectious diseases being obvious candidates for more widespread applications. Genetic diagnoses have been reported in inflammatory bowel disease25 and in rare diseases, such as dopa-responsive dystonia26 and congenital chloride diarrhea,27 with some of these analyses leading to unanticipated clinical decisions and patient benefit that would otherwise have been infeasible. In the case of congenital chloride diarrhea, sequencing of a single exome was sufficient to obtain a valid but unanticipated diagnosis in a patient who had been suspected of having renal salt wasting due to Bartter syndrome. The markedly improved sensitivity of LUS thus presents an unprecedented opportunity for investigation of rare phenotypes and extreme responders, e.g., in rare adverse drug reactions. Finally, the sequencing data can become an inherent part of the new therapeutic, as seen with current personalized immunization approaches that use patient-specific mutation data to generate personalized vaccines encoding putative mutation-spanning antigens.28 Timely pediatric29 and neonatal30 diagnoses and care also stand to benefit from an efficient and robust application of NGS approaches, providing an opportunity for early intervention in disorders that would otherwise present later with life-threatening end-organ damage (e.g., hypertrophic cardiomyopathy); the lysosomal storage disorder Pompe disease is an example for which US Food and Drug Administration (FDA)-approved therapy is currently available for infants with cardiomyopathy.31 Such applications are approaching standard medical practice at major medical centers for neonates and infants with poorly understood phenotypes. Results of such studies may open up time-limited windows of opportunity for presymptomatic or early symptomatic therapies for childhood-onset disorders and usher in an era of genomic newborn screening to complement standard biochemical newborn screening for selected inborn errors of metabolism. That being said, applications in pediatrics can be particularly challenging from an ethical and informed-consent perspective.32 Pilot studies to determine the feasibility, utility, and acceptability of WGS of newborns are ongoing. Myriad technical challenges exist with respect to using LUS in the clinic and in drug discovery and development. Failure to achieve the required level of accuracy (the ability to correctly identify a disease-causing allele at a particular position within the genome) results in poor-quality sequencing, translating into wasted research dollars and possible misdiagnosis.33,34 Accuracy is determined in part by analytic sensitivity and specificity, precision, the quality of reference information, and the bioinformatics/sequence alignment methods used, and it requires 1) laboratory standards, 2) a reliable computing and bioinformatics environment
that aligns the sequence to the appropriate ethnicity-specific reference genome, and 3) the ability to distinguish between a common benign variant and a disease-causing variant on the basis of statistical, clinical, and biological evidence.35 Confirmatory testing of NGS results is currently standard practice, and it seems likely that validation of suspected disease-causing variants will continue to be required in the future. With respect to laboratory standards, most WGS is generated using methods that can be modified by the user.36 Many scientists feel it is their responsibility to modify sequencing protocols in an attempt to provide the best sequencing results possible. Just as microarray data vary by the method used to isolate mRNA and generate labeled cDNA,37 WGS results vary depending on the modifications made to the protocol that has been developed, tested, and validated by the manufacturer.38,39 This variability translates into significant differences among WGS datasets generated by different laboratories.38 Along with the standard process-control methods utilized by reference laboratories in a Clinical Laboratory Improvement Amendments (CLIA) environment, it is important to establish a set of version-controlled standard operating procedures by which a reference laboratory generates reproducible LUS data.40 Ongoing efforts at the National Institute of Standards and Technology are attempting to address this source of variability39; however, standardization should start with the biobanking of samples and continue through data analysis and interpretation. Public-domain databases are now full of sequence data that lack the metadata needed to determine accurately how the sequence was generated. The need for facilities equipped with the laboratories necessary to provide testing for precision medicine must also be matched by a significant investment in informatics processing and data storage infrastructure. The computational and bioinformatics infrastructure, including hardware, software, and skilled analysts, to process these data is becoming broadly available, but informatics processing and data storage costs can require high capital investment,33 and cross-platform standardization is lacking.38 Computer storage and processing requirements for WGS and WES dwarf those for a routine clinical targeted gene panel (by factors of roughly 1,000 and 10, respectively),41 leading to more complexity in data storage and analysis. Analyzing these large datasets involves either matching the reads (millions to billions for a typical run) to an available reference (e.g., from the Genome Reference Consortium or the Global Alliance for Genomics and Health, http://genomicsandhealth.org/; debate as to the best method is ongoing) or assembling the genome de novo, which is much more computationally taxing.42 Available bioinformatics tools and techniques for LUS datasets are being intensely explored43 and have been reviewed elsewhere. The diversity of bioinformatics tools has contributed to the difficulty of developing standardized analysis approaches. For example, a recent comparison of variant detection methods demonstrated a concordance of only 57.4% in the detection of known variants among the five methods tested.44 The number of individual variants detected in a WGS dataset approaches 3 million,33 and the analysis of such a dataset can be taxing with current methods.
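The pipeline-concordance comparisons cited above reduce, at their core, to set operations over normalized variant calls. The sketch below shows the shape of such a comparison under simplifying assumptions (plain-text VCFs, hypothetical file names, no representation normalization); it is not the published methods themselves, which also left-align indels and split multi-allelic records before intersecting.

```python
# Sketch: concordance between two variant-calling pipelines, computed as
# the fraction of the union of calls that both pipelines agree on.

def load_calls(vcf_path: str) -> set:
    """Collect (chrom, pos, ref, alt) keys from a plain-text VCF file."""
    calls = set()
    with open(vcf_path) as fh:
        for line in fh:
            if line.startswith("#"):
                continue  # skip VCF header lines
            chrom, pos, _id, ref, alt = line.split("\t")[:5]
            calls.add((chrom, int(pos), ref, alt))
    return calls

a = load_calls("pipeline_a.vcf")   # hypothetical output files
b = load_calls("pipeline_b.vcf")
concordance = len(a & b) / len(a | b)
print(f"concordance: {concordance:.1%} of the union called by both")
```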
Currently, even the highest level of process control generates a whole genome sequence with up to 5 million potentially significant variants.36 The large number of variants results from using a single reference genome as a scaffold to assemble genomes and establish the "common" sequence. The current reference genome lacks the statistically significant ancestral and familial information required to determine an individual's DNA sequence.34 Filtering out erroneous and insignificant sequence variants requires analytic tools, some of which have been developed using GWAS data from low-pass sequencing or array-based single nucleotide polymorphism (SNP) data.45,46 In addition, databases of clinical and biological data derived from these low-accuracy methods, such as ClinVar, dbGaP, and HGMD, lack ancestral and familial sequence-dependent minor allele frequencies. Despite this, they have become the standard reference information by which the significance of a sequence variant is established.47 There is an urgent need for an improved, highly curated database of the biological significance of genomic variation, e.g., building on ClinVar and ClinGen.48 This will be important not only for the advancement of biomedical science, but also for safe treatment of patients and for drug development in the genomic age.49
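A common first step in the filtering described above is to discard variants that are common in a reference population; a minimal sketch follows, with invented variants, allele frequencies, and threshold. Note that without ancestry-matched frequencies (the gap discussed above), such a filter can misclassify variants that are common only in particular populations.

```python
# Sketch: population allele-frequency filter. Common variants are treated
# as likely benign; rare or unobserved variants are kept for review.

COMMON_AF_THRESHOLD = 0.01   # >1% minor allele frequency treated as benign

def filter_rare(variants: list, population_af: dict) -> list:
    """Keep variants whose population allele frequency is below threshold.

    Variants absent from the reference database are retained, since the
    lack of an observation is not evidence that a variant is benign.
    """
    rare = []
    for v in variants:
        if population_af.get(v, 0.0) < COMMON_AF_THRESHOLD:
            rare.append(v)
    return rare

# Toy data: keys are (chrom, pos, ref, alt); all frequencies are invented.
afs = {("7", 117559590, "G", "A"): 0.32, ("17", 43045711, "C", "T"): 0.0002}
calls = list(afs) + [("2", 101000, "A", "G")]   # the last call is novel
print(filter_rare(calls, afs))   # the common chr7 variant is filtered out
```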
REGULATORY, POLICY, AND COMMERCIAL ENVIRONMENT FOR LUS
Regulatory guidelines for companion diagnostics continue to evolve globally.50 In the US, under the Federal Food, Drug, and Cosmetic Act (the FD&C Act), the FDA is responsible for regulating in vitro diagnostic (IVD) tests and their components and accessories intended for use in the diagnosis of disease or other conditions, or in the cure, mitigation, treatment, or prevention of disease, to ensure these tests are reasonably safe and effective. The FDA applies a risk-based approach to determine the appropriate regulatory pathway for IVDs, as it does with all medical devices. This means the type of premarket submission (e.g., a premarket approval application [PMA] or a premarket notification [510(k)]) depends on the level of risk to patients, based on the intended use of the IVD device and the controls necessary to provide a reasonable assurance of safety and effectiveness. Experience indicates that most IVD companion diagnostic devices are classified as Class III, or "high-risk." The FDA has classified NGS instruments for targeted sequencing of human genomic DNA as Class II exempt devices, meaning that premarket notification is not required. As of this writing, the Illumina MiSeqDx, the Thermo Fisher Scientific Ion PGM Dx System, and the Vela Diagnostics NGS Sentosa have been registered and listed with the FDA. Under the FD&C Act, the FDA assures both the analytical performance and the effectiveness of IVDs. In the US, clinical testing is also performed using laboratory-developed tests (LDTs), a subset of IVDs. The FDA has generally not enforced premarket review and other applicable FDA requirements for LDTs because they were historically relatively simple lab tests available only on a limited basis. However, owing to advances in technology and business models, LDTs have evolved and proliferated significantly since the FDA first obtained comprehensive authority to regulate all in vitro diagnostics as devices in 1976. Some LDTs are now more complex, have a nationwide reach, and present higher risks, such
as detection of risk for breast cancer and Alzheimer's disease, risks similar to those of other IVDs that have undergone premarket review. The Centers for Medicare and Medicaid Services (CMS) is responsible for regulating laboratories, including those that develop LDTs, under the Clinical Laboratory Improvement Amendments of 1988 (CLIA). CLIA governs the accreditation and certification process for laboratories. Because they focus on laboratory quality rather than on IVD design and development, CLIA requirements address different functions than the requirements under the FD&C Act: namely, the laboratory's testing process (i.e., the ability to perform laboratory testing in an accurate and reliable manner). Under CLIA, accreditors do not evaluate test validation prior to marketing, nor do they assess the clinical validity of an LDT (i.e., the accuracy with which the test identifies, measures, or predicts the presence or absence of a clinical condition or predisposition in a patient). As a result, while CLIA oversight is important, it alone does not assure that LDTs are properly designed, consistently manufactured, and safe and effective for patients. In Europe, companion diagnostics are at present not specifically addressed in the Directive on In Vitro Diagnostics, and the pathway to approval is the same as for other in vitro diagnostics. Special provisions for companion diagnostics are foreseen in the new Regulation on In Vitro Diagnostics currently being debated by the European Parliament and Council. A risk-based approach is proposed, with companion diagnostics classified as class C (high risk) and requiring premarket evaluation of clinical data in consultation with the pharmaceutical regulatory authorities responsible for approval and oversight of the relevant medicinal product. The European Medicines Agency (EMA) will be the authority consulted on companion diagnostics for medicinal products that fall by law under the mandatory scope of the Centralized Procedure, including oncology, diabetes, neurodegenerative diseases, immunological diseases, viral diseases, HIV and AIDS, orphan medicines, cell- and gene-based medicines, and all pharmaceuticals based on recombinant DNA and other biotechnology methods.51 Oncology remains the area in which the most biomarker-targeted medicines have been approved and used, and in which the short-term impact of the revised companion diagnostics legislation may be greatest.52 In the evaluation of medicinal products, the EMA has accrued extensive experience53 with the impact of genomics and has generated a number of recommendations relevant to the use of genomic data in the life-cycle development and management of medicines, pre- and post-authorization.54 In addition to further supporting reliable and acceptable biomarker-based drug development, in 2010 the EMA launched a new process, within the context of scientific advice for drug development, specifically aimed at the regulatory qualification of novel methodologies for drug development; in this process, all aspects relevant to the characteristics of the chosen biomarkers are fully discussed.55 Apart from regulations that ensure quality in the clinical laboratory testing system, once accurate genomic data have been obtained, the question of which results to return becomes an
Figure 2 Estimated cost of WGS in the clinical setting, including, in addition to sequencing costs, informed consent, interpretation, bioinformatics, confirmatory testing, and data storage. Cost of sequencing is a very small proportion of the true cost of using WGS in the clinical setting.
issue from the policy and ethical standpoint. The biggest challenges relate to how to handle unexpected findings56 from WGS/WES datasets and how best to manage the related risks to data security, privacy, and genetic nondiscrimination.57 This is currently a very controversial area, and many articles have been written on the proper disclosure of WGS incidental findings.58 Some have spoken of an "ethical duty" to return all results once "validity, significance and benefit"59 have been proven. An example of this type of situation is the unanticipated discovery of a BRCA1/2 mutation, which will be picked up in WES/WGS and will have an impact on the patient as well as implications for their family. However, there can be unintended effects of incidental findings, including improper treatment decisions and psychological and/or physical harm; these concerns have prompted recent guidelines.60 Moreover, variants that are currently of unknown significance could become significant in the future. Thus, the individual risk profile of a patient will potentially change over time, further complicating how this information needs to be shared, managed, and updated. When ready access to genome sequencing becomes available, either clinically or via direct-to-consumer models, much of the ethical tension over whether pharmaceutical companies should disclose research-based results may be resolved, as patients would have a straightforward path to personal genome sequencing results in the more appropriate clinical care setting. It may thus be argued, from the pharmaceutical company perspective, that the industry ought not to take on a leadership role in providing incidental genomic results to patients when health professionals have yet to reach consensus on which results should be disclosed under various circumstances.61 Given the theoretical risk of unintended harm from clinically misinterpreted or analytically invalid pharma-initiated delivery of incidental or investigative genomic test results, as well as the uncertain benefit and actionability of valid results, results that were not ordered with informed consent and performed in a CLIA-certified lab (or an equivalently certified non-US lab) should not be returned. It should also be noted that transferring "know-how" based on genomic data requires taking into account differing
perspectives on privacy in the different jurisdictions governing health data. In this respect, the EMA provided early guidance in a reflection paper62 on aspects of data collection and handling to support the use of the genomic data recommended to be generated during the life-cycle management of medicines.63,64 This brings us to how LUS data should be stored over time. A first question concerns whose responsibility it is to store the data: the physician's, the patient's, or that of the service that generated it. While it may be more cost-effective to manage the large datasets generated by LUS within an institution, the most efficient way may be through cloud-based resources,65 which make it easier to share data with others and to scale up infrastructure smoothly. Unfortunately, these also carry the highest risk of compromising protected health information. Protected health information managed in a cloud environment should be de-identified and encrypted in transit and in storage.66 Patient-identifiable data should be protected, ideally in a manner that does not inhibit biomarker discovery or clinical trials research.
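A minimal sketch of the de-identification and at-rest encryption recommendation above follows, using Python's standard hmac module for keyed pseudonyms and the third-party cryptography package for symmetric encryption. Key handling and the record format are illustrative assumptions; a production system would add managed keys, audit logging, and transport-layer encryption (e.g., TLS).

```python
# Sketch: replace a direct identifier with a keyed pseudonym, then
# encrypt the genomic payload before it leaves the institution.
import hashlib
import hmac

from cryptography.fernet import Fernet  # pip install cryptography

PSEUDONYM_KEY = b"site-secret-key"        # hypothetical; store securely
storage_key = Fernet.generate_key()       # symmetric key for data at rest

def pseudonymise(patient_id: str) -> str:
    """Keyed hash: the same patient always maps to the same opaque token,
    so research datasets stay linkable without exposing the identifier."""
    return hmac.new(PSEUDONYM_KEY, patient_id.encode(),
                    hashlib.sha256).hexdigest()

fernet = Fernet(storage_key)
record = b"chr7\t117559590\tG\tA\t..."    # genomic payload, truncated
encrypted = fernet.encrypt(record)        # ciphertext for cloud storage

print(pseudonymise("patient-0042"))       # opaque identifier
assert fernet.decrypt(encrypted) == record
```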
Patients and research participants who have their genome, exome, or transcriptome sequenced for any reason should sign a consent that documents their decisions about which data should and should not be returned to them and how, provides a mechanism for use of the data for research purposes, and reassures individuals that their data will be handled confidentially. Beyond country-specific restrictions on how these data are used for clinical care, there remains the question of "incidental data": how researchers can gain access to data generated for clinical diagnosis in order to improve clinical trials, develop analytic tools, identify diagnostic, therapeutic, and prognostic biomarkers, and improve sequencing technology itself. From the commercial standpoint, the biggest hurdle is probably the existing uncertainty over the extent to which payers will reimburse for expensive diagnostic approaches, even when selection improves patient outcomes.67 Return on investment, whether by pharmaceutical companies, diagnostic companies, patients, or payers, is another challenge, considering the implementation and deployment cost of these technologies. The actual cost of the "$1,000 genome" (Figure 2), even when instrumentation is accessible, may approach $14,000 to $22,000 once informed consent, bioinformatics, interpretation, confirmatory testing, genetic counseling, and data storage are factored in.68 Economic analyses have been conducted for companion diagnostics, including the main factors differentiating a precision medicine strategy from an all-comers strategy.69 For drug development, genetic markers would ideally result in smaller and faster trials and accelerated approvals, although extra studies may be needed to establish marker-response correlations, with potential adverse effects on patient recruitment and development times. LUS would have the added benefit of having patients already screened, which could facilitate clinical trial enrollment; such benefits should be considered when calculating the overall return on investment for LUS, provided additional costs related to instrumentation and analysis are taken into account (a crucial aspect of the ultimate feasibility of LUS). While some economic evaluation of LUS in a clinical setting has been attempted,70 more studies will undoubtedly follow to clarify its use in clinical practice. As a final consideration, ensuring accurate patient screening while recouping the research and development (R&D) costs of a companion diagnostic presents another challenge.

OPPORTUNITIES AND NEEDS
Some in the scientific community believe that NGS can substantially replace existing genetic diagnostics,71 with other LUS platforms providing even more opportunities. The current regulatory paradigm of "one-biomarker, one-test" makes obtaining approval for drugs targeting rare mutations costly and time-consuming, and it is not easily extended to LUS. Most current tests focus on the presence or absence of a given mutation (be it a single nucleotide variant (SNV), insertion/deletion (INDEL), or copy-number variation (CNV), germline or somatic) or rearrangement; this binary information is very different from granular, detailed LUS datasets. There is also the potential to follow up test results with additional tests and procedures. A clear standard, or even a unified approach to a system that ensures equivalence and portability of results, has not emerged to date (http://www.mckinsey.com/~/media/mckinsey/dotcom/client_service/pharma and medical products/pmp new/pdfs/mckinsey on personalized medicine march 2013.ashx). In a way, LUS is an opportunity to simplify diagnostics. We propose that, ideally, LUS and the information it produces should be designed and evaluated as a "system," as opposed to fragmenting the design of each LUS diagnostic (and thus the approval landscape) into individual products. The rationale for this position is clear: the analyte (the DNA sequence) serves as a simple language regardless of the gene product it encodes. It is possible to demonstrate accuracy across nearly all sequences, and thus to evaluate DNA sequencing on the validity with which it reads out this sequence. This would allow a systematic regulatory review of the accuracy of the system (or of a primary system with a confirmatory method), such that the system could be approved as a whole, and the information derived from the system that applies to genetic information (rather than
technical aspects of the system) could then constitute the individually regulated quanta.72 Such an approach would involve validating the NGS system (e.g., with a confirmatory technology) as providing accurate DNA sequence information, based on a broad validation exercise using a selected number of known sequences. The system could then become a single regulatory entity, and disclosure of the information it provides could be regulated through labeling amendments. In this manner, for example, when a drug is approved that addresses an additional CFTR genotype, or when additional mutations in an oncogene are found to be clinically relevant, these could be added to the approved labeling for disclosure without the need to create a new regulated entity or to conduct additional analytical review. Alternatively, the information could simply be added to the approved drug labeling, although this could prove cumbersome as multiple approved mutations accumulate. The US FDA is discussing similar concepts for the regulation of NGS as part of President Obama's Precision Medicine Initiative. In terms of other needs, it is difficult to describe disease without the context of normal. A database of genomic variation that is statistically accurate with regard to the frequency of normal minor allele variation in the genome, and that provides this information in the context of ancestry, is essential. A second resource necessary for accurate interpretation of genomic information is a reference database containing the frequencies of disease-causing variants in ancestral populations. Finally, a set of reference genomes based on ancestry, or a dynamic pan-reference genome that can be used regardless of the ancestry of the individual being sequenced, is necessary to properly instruct the assembly of sequence information; this can be done either while the sequence is being generated or through post-sequencing de novo assembly. The knowledge base on which LUS rests is dynamic over time. This leaves open whose responsibility it is to reprocess LUS datasets with new algorithms, and who bears the associated reanalysis costs. One approach would be a tiered reporting system, in which information is categorized as actionable or not depending on the state of the treatment and the associated science. This would not limit access to information, but rather would label it according to its expected use at a specific point in time.60 Alternatively, instead of reporting the full set of results from LUS, one could imagine reporting a limited number of variants or genes, such as those of interest to the treating clinician,73 simply by applying an "informatics mask" to the data that filters out everything except the data relevant to the specific clinical question. The potential advantage is that when a new gene becomes of interest, one would not generate a new assay, but rather modify, validate, and deploy a new mask to be applied to the same dataset.
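A minimal sketch of such an informatics mask follows, assuming variants and masks are held as simple in-memory structures; the gene coordinates are placeholders, not a validated panel definition.

```python
# Sketch: report only variants falling inside masked gene intervals,
# leaving the underlying LUS dataset untouched. A new clinical question
# means validating and deploying a new mask, not running a new assay.

def apply_mask(variants: list, mask: dict) -> list:
    """Return only variants inside the masked gene intervals."""
    reported = []
    for chrom, pos, ref, alt in variants:
        for gene, (m_chrom, start, end) in mask.items():
            if chrom == m_chrom and start <= pos <= end:
                reported.append((gene, chrom, pos, ref, alt))
    return reported

# A version-controlled mask for one clinical question (coordinates are
# placeholders for illustration only).
cf_mask_v1 = {"CFTR": ("7", 117480025, 117668665)}
calls = [("7", 117559590, "G", "A"), ("17", 43045711, "C", "T")]
print(apply_mask(calls, cf_mask_v1))   # only the CFTR variant is reported
```

Because the mask, not the assay, carries the clinical intent, versioning the mask file mirrors the labeling-amendment model proposed above for the sequencing "system" as a whole.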
Figure 3 The landscape of entities interested in the use of LUS as a clinical diagnostic.
An alternative to filtering out some potentially informative data is to recognize that there will be clinical claims associated with some biomarkers, with the rest carrying analytical claims. For LUS data, the clinical utility (and the attendant risk) may not arise until a therapy is available for a given genetic variant or mutation, i.e., until the data become actionable information. Before then (even in later-stage research, such as a pivotal clinical trial), those data could be considered exploratory research and handled according to the informed consent subscribed to by the patient. Open collaboration between pharmaceutical companies and other players, including LUS companies, can maximize data availability (Figure 3). Another driver for collaboration is the development of novel bioinformatics solutions, crucial for handling and interpreting what is rapidly becoming an unprecedented amount of complexity. If challenges related to informed consent, data sharing, and platform harmonization can be met, NGS data mining could provide pharmaceutical companies with rich substrates for target selection. However, the current testing environment is very heterogeneous: as with individual LDTs, there are differences between LUS platforms and, as mentioned, between deployments of the same LUS platform at different sites.

CONCLUSION
The importance of delivering the right drug to each patient is unambiguously clear. LUS data still suffer from technical issues such as incomplete genome coverage, and from disagreements about which findings are clinically reportable and therapeutically actionable versus variants of unknown significance.33 LUS for discovery of exploratory biomarkers and clinical diagnostics requires a level of precision that is currently difficult to achieve technically, leading researchers to focus on more targeted sequencing approaches. With the rapid emergence of deep genotyping and phenotyping approaches (next-generation sequencing alongside other high-throughput omics, imaging, and the electronic biomarker and clinical data, such as electronic health records, increasingly collected in clinical care settings), more complex genotype-phenotype correlations are arising. Continuing research is needed to clarify the circumstances under which validated LUS findings associated with these complex correlations are causal, clinically reportable, and actionable when therapeutic options are considered. However, the markedly improved sensitivity of LUS already presents an unprecedented opportunity for investigation
of rare phenotypes and "N-of-1" extreme responders in drug trials,74 to better understand the genetic basis of disease and of therapeutic response, respectively. There is no question that a strong scientific rationale exists for more widespread use of LUS in the clinic, but we agree with the mindset to "proceed with caution."75 LUS will generate a much richer and more flexible set of human genetic data than other technologies, such as SNP-based genotyping, transcript profiling, and multigene panel-based targeted sequencing. If some of the technical challenges outlined in this article are successfully met, these richer genetic datasets will likely enable collaborative drug discovery and development in unique ways: not only through population-level data on genetic variant frequency, correlation to phenotype and outcome for patient stratification and disease understanding, and alignment with electronic health records (http://www.abpi.org.uk/our-work/library/medical-disease/Pages/personalised-medicines.aspx), but also by uncovering the involvement of unexpected genes or pathways and, especially, complex network effects that are difficult to anticipate. In our opinion, policy formulated at this particular time of rapid technological change and accretion of knowledge will face challenges in this area of development; policy statements should therefore retain some degree of flexibility, since the rapid technological evolution the field is witnessing could make specific recommendations short-lived. A public-private partnership model that addresses the practical limitations of LUS and engages various stakeholders, including pharmaceutical companies, technology providers, patient organizations, and other interested parties, would likely maximize information sharing, increase the utility of data, and ensure long-term applicability to the clinic. Our recommendation is to devote significant effort to the establishment of process and reference standards that facilitate the interpretation of variants in individual LUS datasets and the unambiguous reporting of findings, provided these are obtained in accordance with good clinical practice. We conclude that a
cautious, but proactive, approach is critical for the administration and handling of LUS tests and diagnostics, a rapidly evolving field still characterized by the rapid obsolescence of new technology.

ACKNOWLEDGMENTS
Current addresses: P Vicini, MedImmune, Cambridge, UK; A-M Martin, Adaptimmune LLC, Philadelphia, PA, USA; T Zaks, Moderna Therapeutics, Cambridge, MA, USA; M Robson, Bristol-Myers Squibb, Princeton, NJ, USA. The lead authors thank numerous colleagues for providing their insight and expertise during the writing of this article: Diane Louie, Jean-Claude Marshall, William Mounts, Paul Rejto, Susan Stephens, David von Schack, and Jeffrey Florian.
DISCLAIMER
The views and opinions expressed in this article are the personal views and opinions of the authors. They do not necessarily reflect the official policy or position of any agency of the US FDA or of the European Medicines Agency. The definitions and terms listed in this article do not replace or modify any existing regulations or regulatory language; they are provided as an aid to understanding some of the more complex regulatory concepts related to this topic. These definitions concern generalities only and may not apply to any specific device or situation. One should always consult a regulatory specialist or the FDA for specifics regarding any device intended for marketing or for which regulatory information is sought.

CONFLICT OF INTEREST
The authors declare no conflicts of interest.

© 2015 American Society for Clinical Pharmacology and Therapeutics
REFERENCES
1. Hood, L. & Flores, M. A personal view on systems medicine and the emergence of proactive P4 medicine: predictive, preventive, personalized and participatory. New Biotechnol. 29, 613–624 (2012).
2. President's Council of Advisors on Science and Technology. Priorities for Personalized Medicine. (2008).
3. Trusheim, M.R., Berndt, E.R. & Douglas, F.L. Stratified medicine: strategic and economic implications of combining drugs and clinical biomarkers. Nat. Rev. Drug Discov. 6, 287–293 (2007).
4. Momper, J.D. & Wagner, J.A. Therapeutic drug monitoring as a component of personalized medicine: applications in pediatric drug development. Clin. Pharmacol. Ther. 95, 138–140 (2014).
5. Muller, P.Y. & Milton, M.N. The determination and interpretation of the therapeutic index in drug development. Nat. Rev. Drug Discov. 11, 751–761 (2012).
6. Walker, I. & Newell, H. Do molecularly targeted agents in oncology have reduced attrition rates? Nat. Rev. Drug Discov. 8, 15–16 (2009).
7. Ramsey, B.W. et al. A CFTR potentiator in patients with cystic fibrosis and the G551D mutation. N. Engl. J. Med. 365, 1663–1672 (2011).
8. Dolsten, M. & Søgaard, M. Precision medicine: an approach to R&D for delivering superior medicines to patients. Clin. Transl. Med. 1, 7 (2012).
9. Scannell, J.W., Blanckley, A., Boldon, H. & Warrington, B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat. Rev. Drug Discov. 11, 191–200 (2012).
10. Modur, V., Hailman, E. & Barrett, J.C. Evidence-based laboratory medicine in oncology drug development: from biomarkers to diagnostics. Clin. Chem. 59, 102–109 (2013).
11. Mologni, L. Inhibitors of the anaplastic lymphoma kinase. Expert Opin. Investig. Drugs 21, 985–994 (2012).
12. Flaherty, K.T., Yasothan, U. & Kirkpatrick, P. Vemurafenib. Nat. Rev. Drug Discov. 10, 811–812 (2011).
13. Gazdar, A.F. Personalized medicine and inhibition of EGFR signaling in lung cancer. N. Engl. J. Med. 361, 1018–1020 (2009).
14. O'Donnell, P.H. et al. The 1200 Patients Project: creating a new medical model system for clinical implementation of pharmacogenomics. Clin. Pharmacol. Ther. 92, 446–449 (2012).
15. Pulley, J.M. et al. Operational implementation of prospective genotyping for personalized medicine: the design of the Vanderbilt PREDICT project. Clin. Pharmacol. Ther. 92, 87–95 (2012).
16. Sims, D., Sudbery, I., Ilott, N.E., Heger, A. & Ponting, C.P. Sequencing depth and coverage: key considerations in genomic analyses. Nat. Rev. Genet. 15, 121–132 (2014).
17. The Cancer Genome Atlas Research Network. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
18. The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
19. Haga, S.B. 100k Genome Project: sequencing and much more. Personal. Med. 10, 761–764 (2013).
20. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
21. MacArthur, D.G. et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science 335, 823–828 (2012).
22. Creighton, C.J., Reid, J.G. & Gunaratne, P.H. Expression profiling of microRNAs by deep sequencing. Brief. Bioinform. 10, 490–497 (2009).
23. Van Allen, E.M. et al. Whole-exome sequencing and clinical interpretation of formalin-fixed, paraffin-embedded tumor samples to guide precision cancer medicine. Nat. Med. 20, 682–688 (2014).
24. Welch, J.S. et al. Use of whole-genome sequencing to diagnose a cryptic fusion oncogene. JAMA 305, 1577–1584 (2011).
25. Worthey, E.A. et al. Making a definitive diagnosis: successful clinical application of whole exome sequencing in a child with intractable inflammatory bowel disease. Genet. Med. 13, 255–262 (2011).
26. Bainbridge, M.N. et al. Whole-genome sequencing for optimized patient management. Sci. Transl. Med. 3, 87re3 (2011).
27. Choi, M. et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proc. Natl. Acad. Sci. U. S. A. 106, 19096–19101 (2009).
28. Boisguérin, V. et al. Translation of genomics-guided RNA-based personalised cancer vaccines: towards the bedside. Br. J. Cancer 111, 1469–1475 (2014).
29. Wilson, M.R. et al. Actionable diagnosis of neuroleptospirosis by next-generation sequencing. N. Engl. J. Med. 370, 2408–2417 (2014).
30. Saunders, C.J. et al. Rapid whole-genome sequencing for genetic disease diagnosis in neonatal intensive care units. Sci. Transl. Med. 4, 154ra135 (2012).
31. Kishnani, P.S. et al. Early treatment with alglucosidase alfa prolongs long-term survival of infants with Pompe disease. Pediatr. Res. 66, 329–335 (2009).
32. Lunshof, J.E. Whole genomes, small children, big questions. Personal. Med. 9, 667–669 (2012).
33. Dewey, F.E. et al. Clinical interpretation and implications of whole-genome sequencing. JAMA 311, 1035–1045 (2014).
34. Solomon, B.D. Obstacles and opportunities for the future of genomic medicine. Mol. Genet. Genomic Med. 2, 205–209 (2014).
35. Bodian, D.L. et al. Germline variation in cancer-susceptibility genes in a healthy, ancestrally diverse cohort: implications for individual genome sequencing. PLoS ONE 9, e94554 (2014).
36. Jorgenson, E. & Witte, J.S. A gene-centric approach to genome-wide association studies. Nat. Rev. Genet. 7, 885–891 (2006).
37. Badiee, A., Eiken, H., Steen, V. & Lovlie, R. Evaluation of five different cDNA labeling methods for microarrays using spike controls. BMC Biotechnol. 3, 23 (2003).
38. Lam, H.Y.K. et al. Performance comparison of whole-genome sequencing platforms. Nat. Biotech. 30, 78–82 (2012).
39. Zook, J.M. et al. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls. Nat. Biotech. 32, 246–251 (2014).
40. Rehm, H.L. et al. ACMG clinical laboratory standards for next-generation sequencing. Genet. Med. 15, 733–747 (2013).
41. Gullapalli, R.R., Desai, K.V., Santana-Santos, L., Kant, J.A. & Becich, M.J. Next generation sequencing in clinical medicine: challenges and lessons for pathology and biomedical informatics. J. Pathol. Inform. 3, 40 (2012).
42. Miller, J.R., Koren, S. & Sutton, G. Assembly algorithms for next-generation sequencing data. Genomics 95, 315–327 (2010).
43. Ding, L., Wendl, M.C., McMichael, J.F. & Raphael, B.J. Expanding the computational toolbox for mining cancer genomes. Nat. Rev. Genet. 15, 556–570 (2014).
44. O'Rawe, J. et al. Low concordance of multiple variant-calling pipelines: practical implications for exome and genome sequencing. Genome Med. 5, 1–18 (2013).
45. Bodian, D.L. et al. Diagnosis of an imprinted-gene syndrome by a novel bioinformatics analysis of whole-genome sequences from a family trio. Mol. Genet. Genomic Med. 2, 530–538 (2014).
46. Biesecker, L.G. & Green, R.C. Diagnostic clinical genome and exome sequencing. N. Engl. J. Med. 370, 2418–2425 (2014).
47. Rosenfeld, J.A., Mason, C.E. & Smith, T.M. Limitations of the human reference genome for personalized genomics. PLoS ONE 7, e40294 (2012).
48. Rehm, H.L. et al. ClinGen—the Clinical Genome Resource. N. Engl. J. Med. 372, 2235–2242 (2015).
49. Evans, B.J., Burke, W. & Jarvik, G.P. The FDA and genomic tests—getting regulation right. N. Engl. J. Med. 372, 2258–2264 (2015).
50. Parkinson, D.R., Johnson, B.E. & Sledge, G.W. Making personalized cancer medicine a reality: challenges and opportunities in the development of biomarkers and companion diagnostics. Clin. Cancer Res. 18, 619–624 (2012).
51. European Parliament. Regulation (EC) No 726/2004 of the European Parliament and of the Council of 31 March 2004 laying down Community procedures for the authorisation and supervision of medicinal products for human and veterinary use and establishing a European Medicines Agency. (2004).
52. Pignatti, F. et al. Cancer drug development and the evolving regulatory framework for companion diagnostics in the European Union. Clin. Cancer Res. 20, 1458–1468 (2014).
53. Ehmann, F., Caneva, L. & Papaluca, M. European Medicines Agency initiatives and perspectives on pharmacogenomics. Br. J. Clin. Pharmacol. 77, 612–617 (2014).
54. European Medicines Agency. EMA scientific guidance documents on pharmacogenomics. (2015).
55. European Medicines Agency. Qualification of novel methodologies for medicine development. (2015).
56. Wolf, S.M. et al. Managing incidental findings in human subjects research: analysis and recommendations. J. Law Med. Ethics 36, 219–248 (2008).
57. Yang, Y. et al. Clinical whole-exome sequencing for the diagnosis of Mendelian disorders. N. Engl. J. Med. 369, 1502–1511 (2013).
58. Burke, W. et al. Recommendations for returning genomic incidental findings? We need to talk! Genet. Med. 15, 854–859 (2013).
59. Knoppers, B.M., Joly, Y., Simard, J. & Durocher, F. The emergence of an ethical duty to disclose genetic research results: international perspectives. Eur. J. Hum. Genet. 14, 1170–1178 (2006).
60. Green, R.C. et al. ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet. Med. 15, 565–574 (2013).
61. Parkman, A. et al. Public awareness of genetic nondiscrimination laws in four states and perceived importance of life insurance protections. J. Genet. Counsel. 1–10 (2014).
62. European Medicines Agency. Reflection paper on pharmacogenomic samples, testing and data handling. (2007).
63. European Medicines Agency. Guideline on the use of pharmacogenetic methodologies in the pharmacokinetic evaluation of medicinal products. (2015).
64. European Medicines Agency. Draft guideline on key aspects for the use of pharmacogenomic methodologies in the pharmacovigilance evaluation of medicinal products. (2015).
65. Liu, B. et al. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses. J. Biomed. Inform. 49, 119–133 (2014).
66. Rodrigues, J.J., de la Torre, I., Fernández, G. & López-Coronado, M. Analysis of the security and privacy requirements of cloud-based electronic health records systems. J. Med. Internet Res. 15, e186 (2013).
67. Trusheim, M.R. & Berndt, E.R. Economic challenges and possible policy actions to advance stratified medicine. Personal. Med. 9, 413–427 (2012).
68. Chrystoja, C.C. & Diamandis, E.P. Whole genome sequencing as a diagnostic test: challenges and opportunities. Clin. Chem. 60, 724–733 (2014).
69. Trusheim, M.R. et al. Quantifying factors for the success of stratified medicine. Nat. Rev. Drug Discov. 10, 817–833 (2011).
70. Shashi, V. et al. The utility of the traditional medical genetics diagnostic evaluation in the context of next-generation sequencing for undiagnosed genetic disorders. Genet. Med. 16, 176–182 (2014).
71. Desai, A.N. & Jere, A. Next-generation sequencing: ready for the clinics? Clin. Genet. 81, 503–510 (2012).
72. Lander, E.S. Cutting the Gordian helix—regulating genomic testing in the era of precision medicine. N. Engl. J. Med. 372, 1185–1186 (2015).
73. Kruglyak, K.M., Lin, E. & Ong, F.S. Next-generation sequencing in precision oncology: challenges and opportunities. Expert Rev. Mol. Diagn. 14, 635–637 (2014).
74. Brannon, A.R. & Sawyers, C.L. "N of 1" case reports in the era of whole-genome sequencing. J. Clin. Investig. 123, 4568–4570 (2013).
75. Feero, W. Clinical application of whole-genome sequencing: proceed with care. JAMA 311, 1017–1019 (2014).