Review

Computational methods for early predictive safety assessment from biological and chemical data

1. Introduction
2. Computational methods for early predictive safety assessment
3. Expert opinion

Expert Opin. Drug Metab. Toxicol. Downloaded from informahealthcare.com by Novartis Pharma on 01/20/15 For personal use only.

Florian Nigsch†, Eugen Lounkine, Patrick McCarren, Ben Cornett, Meir Glick, Kamal Azzaoui, Laszlo Urban, Philippe Marc, Arne Müller, Florian Hahne, David J Heard & Jeremy L Jenkins †

Novartis Institutes for BioMedical Research, Chemical Biology Informatics, Quantitative Biology, Developmental and Molecular Pathways, Novartis Campus Basel, CH-4056 Basel, Switzerland

Introduction: The goal of early predictive safety assessment (PSA) is to keep compounds with detectable liabilities from progressing further in the pipeline. Such compounds jeopardize the core of pharmaceutical research and development and limit the timely delivery of innovative therapeutics to the patient. Computational methods are increasingly used to help understand observed data, generate new testable hypotheses of relevance to safety pharmacology, and supplement and replace costly and time-consuming experimental procedures.

Areas covered: The authors survey methods operating on different scales of both physical extension and complexity. After discussing methods used to predict liabilities associated with structures of individual compounds, the article reviews the use of adverse event data and safety profiling panels. Finally, the authors examine the complexities of toxicology data from animal experiments and how these data can be mined.

Expert opinion: A significant obstacle for data-driven safety assessment is the absence of integrated data sets due to a lack of sharing of data and of using standard ontologies for data relevant to safety assessment. Informed decisions to derive focused sets of compounds can help to avoid compound liabilities in screening campaigns, and improved hit assessment of such campaigns can benefit the early termination of undesirable compounds.

Keywords: bioinformatics, cheminformatics, computational toxicology, predictive safety assessment

Expert Opin. Drug Metab. Toxicol. (2011) 7(12):1497-1511

1. Introduction

The ultimate goal of predictive safety assessment (PSA) is to protect patients from adverse drug reactions (ADRs). The number of patients taking pharmaceutical products is growing with the aging population in industrialized countries and with improving health care in the developing world. In the US alone, more than 47% of the population take at least one prescription medication a month. The percentage taking at least three medications per month has been steadily increasing across all age groups; for example, in children this has doubled to 10% in the past decade [1]. Acute toxicity due to ADRs has been estimated to result in up to 15% of hospital admissions [2] and as many as 100,000 deaths per year in US hospitals [3]. Many toxicities are discovered only in human trials or following approval because earlier studies are done in surrogate organisms, mostly in rodent and various nonrodent species, or in small well-controlled human studies in defined patient groups. Thus, identifying, in advance, problems that may be specific to humans, or specific subpopulations, is not only more valuable but also more difficult. Emerging markets such as China, India, Russia, Brazil, Mexico, Indonesia and Turkey have a combined population of approximately 3.2 billion people [4]. The increase in potential users of prescription drugs entails an increase in genetic differences, some of which affect drug metabolism and efficacy [5]. Increased diversity in the genetic pool also increases the risk of idiosyncratic ADRs.

10.1517/17425255.2011.632632 © 2011 Informa UK, Ltd. ISSN 1742-5255 All rights reserved: reproduction in whole or in part not permitted

Article highlights.

Predictive safety assessment (PSA) is a crucial activity in the drug discovery and development process. The earlier potential compound liabilities can be detected, the earlier they can be addressed. As a result, safer medicines can be delivered to patients at greater speed and reduced cost. There are a number of methods and data sources that are used to inform the decision-making processes involved in PSA. The range of methods is as heterogeneous as the data employed, ranging from detailed calculations on single compounds to determine their reactivity toward other molecular species, to the use of adverse drug reaction data and safety profiling panels, to the warehousing and systematic mining of data from animal studies.

This box summarizes key points contained in the article.

Drug toxicity continues to be a critical factor for drug discovery and development and accounted for more than 20% of attrition in clinical studies in 2000 [6]. The financial burden of late-stage compound attrition can reach hundreds of millions of dollars [7]. It is much less costly to fail in earlier stages of the discovery and development pipeline. PSA addresses this need by identifying compound liabilities early on, which helps in deciding whether to progress or de-prioritize individual compounds and alerts project teams to potential future problems with a compound or series of compounds.

PSA is incorporated at several stages and provides a variety of data to inform the drug discovery and development process. In early discovery, in silico methods are routinely applied to prioritize compounds or compound series, or to flag them according to certain potential liabilities (e.g., reactive groups known to bind to DNA such as aldehydes) [8-11]. In intermediate stages, in vitro assays such as binding assays (hERG) or Ames tests are used to filter out compounds [12-18]. Following lead selection, whole-animal experiments are performed in various species that combine pharmacokinetics/pharmacodynamics, biomarkers (aspartate aminotransferase/alanine aminotransferase, blood pressure), toxicology (body weight, organ weight, in-life observations), gene expression data, hematology and microscopic histopathology readouts to reveal toxic endpoints resulting from compound administration [19-21].
While the application of computational filters or in vitro assays is inexpensive and rapid, animal experiments are costly and time consuming, ranging from 14 days to 2 years [19-22]. In vivo experiments, which are required by the regulatory agencies, provide complex information on exposure, efficacy and toxicity that is more relevant for clinical performance than what can be gleaned from an in silico prediction of a potentially incompatible structural feature. This implies that in silico predictions based, for example, on links between structural features and clinical ADRs should not support compound termination but trigger safety assessment by in vitro and in vivo profiling and, once proven, guide mitigation. The true value of the predictive approach is that it allows more freedom for chemists at the lead selection and optimization phases to make structural modifications that limit the risk of unacceptable toxicity early in the pipeline, rather than following more costly animal or human studies, and lead to greater safety for patients. Several examples of successful computational approaches have been published, such as in silico-driven mitigation of promiscuity and activity at well-known off-targets, that provide evidence for the impact of PSA [23,24].

Computational predictive safety encompasses the application of computational techniques to data that stem from experiments probing toxic effects of molecules. This includes the application of algorithms to generate new knowledge from existing data as well as the annotation, storage and retrieval of data (Figure 1). The existence of large amounts of safety data covering a large number of diverse molecules is essential to allow for the creation of reliable computational models, especially to assess new chemical entities. The data themselves come from many different sources including binding assays, in vitro and in vivo experiments, preclinical and clinical safety information, and post-marketing safety data gathering (pharmacovigilance). The diversity of data is reflected in the diversity of computational techniques that are applied to such data.
Figure 1 illustrates the iterative development cycle of computational models for predictive safety, starting from the integration of disparate and heterogeneous data obtained in biochemical, preclinical or clinical studies to the conception of algorithms and models that then can guide the design of further experiments. In the following, we cover a range of different techniques of relevance to computational PSA. We explore aspects of predictive safety that range from detailed calculations on individual small molecules to the profiling of compounds across safety panels to whole-animal experiments. Each of these techniques has its set of limitations and challenges. Their involvement in the process of PSA, however, is justified through their scientific contributions that lead to more informed decisions about the fate of single compounds or entire classes of compounds.

2. Computational methods for early predictive safety assessment

Computational methods for early PSA are of considerable value if they are used in the right way at the right time. This requires the understanding of inherent limitations and assumptions in the methods and models used. Any risk associated with a particular compound is always a function of hazard and exposure. In line with the saying ‘the dose makes the poison,’ it is important to consider predictions from computational models in that context: such predictions may not reflect toxicities observable in vivo due to limitations in the models that stem from lack of data or coverage of the chemical space, as well as pharmacodynamic and pharmacokinetic (absorption, distribution, metabolism and excretion properties of the compound) considerations (e.g., a compound with low blood-brain barrier permeability is unlikely to cause central nervous system (CNS) effects).

[Figure 1 panel labels. Experimental data/observations: pharmacokinetics, pharmacodynamics, preclinical toxicity, clinical trials, post-marketing data. Computational approaches: expert systems, quantum mechanics, QSAR, statistical models, structure-based models, data integration and federation. Inputs: ligand structure, target structure.]

Figure 1. Overview of computational methods of use for predictive safety assessment. QSAR: quantitative structure-activity relationship.

2.1 Prediction of toxicological endpoints for small molecules

A number of theoretical methods have been developed to correlate the toxicities of small molecules with their chemical structures or to propose a mechanism of toxicity and prioritize testing of compounds of concern. These methods typically fall into the following categories: i) expert or rule-based systems; ii) statistical models; iii) quantum mechanics calculations; iv) structure-based approaches; and v) use of safety panels and the data generated. A crucial component for the development of predictive methods is their assessment in terms of sensitivity, specificity and overall predictivity [25].

Expert or rule-based systems are based on data of historical associations of toxicological endpoints with certain substructural features. These relatively straightforward methods are reasonably accurate, accessible to experts and nonexperts alike, and provide a first-line approach to toxicity evaluation. Examples of rule-based systems are DEREK [9,26] and ToxTree, which include the rulebases of Ashby and Benigni [27]. The rulebase contains substructures known to be associated with specific toxicities, for example, nitroaromatic groups causing genotoxicity. These methods can be effective as first screens, with correct prediction of genotoxic and mutagenic compounds of at least 73 and 62%, respectively [28]; however, validations of these methods using external sets for genotoxicity prediction have noted problems with sensitivity, leading to a detection of mutagenicity in only 52% of cases [28-30]. One of the reasons for this low accuracy is the inherent inability of rule-based models to extrapolate to new chemotypes or to compounds that contain relatively few of the rule-encoded structural features [31]. Furthermore, these methods do not take into account reactive metabolites that may be formed.

Statistical classification and regression models are the most used approaches for predicting the toxicity of small molecules. This has been an active area of research in recent years, to a certain degree limited by the availability and quality of appropriate data. These models are also employed to predict from structural features and intrinsic properties whether a compound might be toxic, or how toxic it might be. For example, MultiCASE [32] is a method that fragments molecules and determines fragments that are associated with different toxicity classes [33].

Reactive intermediates are frequently the cause of molecular toxicity, and quantum mechanics methods have the unique ability to assess and quantify the reactivity or stability of such intermediates. Semi-empirical quantum mechanics calculations, such as AM1 [34] or PM3 [35], are incorporated into models for genotoxicity [36] and are also used for other endpoints associated with reactivity such as immunogenic reactions from protein adducts. These calculations are parameterized methods that calculate reaction energies and orbital energies. The parameterization allows compute times of several minutes per molecule, as opposed to hours without parameterization. The energies of the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) are routinely used as parameters. The HOMO energy correlates with the ionization potential, which provides an estimate of how energetically unfavorable the loss of an electron will be in the generation of a cationic reactive intermediate, whereas the LUMO energy correlates with the electron affinity, that is, the relative stability on acceptance of an electron or propensity to react with an electron-rich substrate [37,38]. However, these orbital energies do not take into account the optimized structure of the resulting ion nor the energy of any reactions involved. The difference between the HOMO and LUMO energies (the HOMO/LUMO gap) determines the reactivity of a compound and has been correlated to the phototoxicity of compounds [39,40].

In conjunction with structural alerts and UV/VIS spectra, the HOMO/LUMO gap can inform decisions with respect to phototoxicity assays. A potential risk assessment schema is presented in Figure 2: if a molecule contains a potentially reactive substructure and the HOMO/LUMO gap is between 6.7 and 7.7 eV, the determination of a UV spectrum is triggered. A high extinction coefficient then triggers a murine local lymph node assay if a dermal application is planned. Otherwise, the 3T3 NRU phototoxicity test can be applied for further investigation. However, in the experience of the authors, those predictions have limited practical value. The HOMO/LUMO difference has also been applied with varying success to the study of other toxicities related to reactivity [36,41].
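The phototoxicity schema described above can be written down as a small decision function. The following Python sketch is only an illustration of that flow: the function name, its arguments and the returned labels are hypothetical, the 6.7 to 7.7 eV window and the 2500 M^-1 cm^-1 cut-off are taken from the text and Figure 2, and two points are assumptions made here, namely that an extinction coefficient at or above the cut-off counts as "high" and that the schema simply does not apply when there is no structural alert or the gap lies outside the window.

```python
from typing import Optional

GAP_WINDOW_EV = (6.7, 7.7)   # HOMO/LUMO gap window quoted in the text
EPSILON_CUTOFF = 2500.0      # max. molar extinction coefficient, M^-1 cm^-1 (Figure 2)


def phototoxicity_triage(has_alert: bool,
                         gap_ev: float,
                         max_epsilon: Optional[float] = None,
                         dermal: bool = False,
                         nru_positive: Optional[bool] = None) -> str:
    """Return the next step suggested by the (simplified) schema.

    max_epsilon is None until a UV spectrum has been measured;
    nru_positive is None until the 3T3 NRU test has been run.
    """
    low, high = GAP_WINDOW_EV
    if not (has_alert and low <= gap_ev <= high):
        # Assumption: outside these conditions the schema is not triggered.
        return "schema not triggered"
    if max_epsilon is None:
        return "obtain UV spectrum"
    if max_epsilon < EPSILON_CUTOFF:
        # Covers "no absorption" (0.0) as well as weak absorption.
        return "low probability of phototoxicity"
    if dermal:
        return "topical UV-LLNA"
    if nru_positive is None:
        return "3T3 NRU in vitro test"
    if nru_positive:
        return "no go; follow-up studies"
    return "low probability of phototoxicity"


if __name__ == "__main__":
    # Hypothetical compound: alert present, gap 7.0 eV, strong UV absorption, oral use.
    print(phototoxicity_triage(True, 7.0, max_epsilon=5000.0, dermal=False))
```

Encoding the flow this way mainly makes the decision points explicit; it does not change the authors' caveat that such predictions have limited practical value.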
Increases in computing power and more advanced computational techniques have permitted the use of accurate quantum mechanics methods on larger scales to determine reaction mechanisms. Density functional theory calculations are the most widely used quantum mechanics methods for this purpose and offer a balance of accuracy and speed [42,43]. In addition to the increased feasibility of these calculations, other developments offer great promise in predicting toxicological endpoints. For example, long-range interactions pose a challenge to quantum mechanics methods, but new density functionals and inclusion of classical long-range potentials can improve the description of intercalation for genotoxicity, halogen-bonding for cytochrome interaction or association complexes [30,44].

Structural knowledge of protein targets associated with toxicity, for example, metabolic enzymes, cardiac ion channels or DNA, allows the study of the interaction of a small molecule with its binding partner. Such knowledge has led to the development of structure-based approaches to predicting toxicity. For example, the prediction of intercalative DNA mutagenesis has been accomplished by docking of compounds into the enclosed space of a representative dinucleotide and its double-stranded complement [45,46]. However, such predictions have to be considered within their practical context: even though a compound may dock into a DNA-like structure in silico, it may never penetrate the nucleus to find an actual DNA molecule to bind to.

Liver injury is one of the most frequent causes of compound attrition in clinical trials, as well as of the withdrawal of drugs from the market [47]. This type of toxicity can be particularly species and individual specific and, therefore, difficult to tackle in a laboratory setting. The FDA recently initiated a collaboration to identify biomarkers that could be used to identify the potential for liver injury by drugs in clinical testing; however, idiosyncratic toxicities remain a significant problem in this area [47]. Recent developments have been reported for identifying potential liver toxicity through a DEREK-based ruleset [48]. Most of these cases are due to reactive metabolites that might be identified using metabolism prediction software. Some methods for the prediction of metabolites are rule based or data driven, such as the program Meteor from Lhasa [26,49]. Homology models and crystal structures for the major human cytochrome P450 isoforms have been used to create interaction models to determine site of metabolism and potential time-dependent intermediates, for example, in the software MetaSite [50]. A comprehensive strategy to help tackle these problems computationally has yet to be reported and there is room for improvement even in devising simple rules.

One important development of the last few years has been the expansion and sharing of data and statistical models in the toxicological community. Regulatory bodies and research funding agencies have provided resources for hosting data, as well as providing data from their own toxicity testing efforts. Toxnet, a database gateway from the National Library of Medicine, integrates data on carcinogenicity, genotoxicity and other toxicity information [51]. This is complementary to an Environmental Protection Agency (EPA) gateway called Aggregated Computational Toxicology Resource [52], which combines EPA’s data with information from more than 500 public sources. PubChem, funded by the National Institutes of Health (NIH), provides data from toxicity assays deposited by users, and it provides links to other databases such as those from the European Molecular Biology Laboratory and the European Bioinformatics Institute. Another trend is the sharing of statistical models and data through collaborative sites such as OpenTox (http://opentox.org/) [10] and OChem (http://www.ochem.eu/) [53].

2.2 Safety concerns due to promiscuity of compounds

Compounds are typically intended and designed to modulate one particular target protein. In practice, however, any compound is prone to interact with a range of other molecular entities present in the cellular environment.

[Figure 2 flowchart labels, in extraction order: substructure alert (yes); HOMO-LUMO calculation (6.7 - 7.7 eV); obtain UV spectra; no absorption or max ε < 2500 cm-1 M-1; low probability of phototoxicity; dermal application (yes/no); topical UV-LLNA; 3T3 NRU in vitro; negative: low probability of phototoxicity; positive: no go; skin/eye distribution; LLNA; other studies.]

Figure 2. Early phototoxicity flowchart showing the steps taken in safety assessment. LLNA: Local lymph node assay.

Compounds that modulate multiple targets are referred to as polypharmacological or promiscuous. In addition to the primary therapeutic target of interest, such compounds have high affinity to secondary targets (off-targets) that can lead to ADRs. Consequently, their usage might be restricted or, in extreme cases, they are withdrawn from the market. By contrast, promiscuous compounds have been found to have beneficial therapeutic effects in many CNS disorders that are linked to multiple targets involving complex pharmacology. By way of example, clozapine is a promiscuous antipsychotic drug that has been on the market for the last 50 years for the treatment of schizophrenia [54]. There is an ongoing debate in the antibacterial [55] and oncology [56] fields about the value of nonselectivity within the protein kinase family. Similar to antipsychotics, protein kinase inhibitor drugs in oncology have relatively high promiscuity with the rare exception of ‘clean’ compounds [57]. Nevertheless, compound promiscuity clearly increases the possibility of unwanted adverse reactions in the clinic [58].

In the past few years, it has become standard practice to test compounds in dedicated safety panels covering many targets that have been associated with adverse events [23,59]. Such panels fulfill the dual purpose of highlighting activity at distant as well as closely related targets. For example, an enzyme inhibitor that also activates a membrane receptor shows a distant off-target effect, whereas the concurrent modulation of several serotonin (5-HT) receptor isoforms stems from insufficient specificity at closely related targets. It is important to be alerted to both cases, as they can equally lead to adverse events. Numerous pharmaceutical companies use safety panels to evaluate compounds in early stages of drug discovery and to optimize compound series if promiscuity is found [58].
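A simple way to turn a safety panel profile into a promiscuity measure is to count the fraction of panel targets at which a compound is active. The Python sketch below illustrates this under stated assumptions: activity is defined by a cut-off of 50% inhibition at a single test concentration (the criterion applied to the profiling data discussed in this section), and the function names, target labels and inhibition values are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List


def hit_rate(inhibition_pct: Dict[str, float], cutoff_pct: float = 50.0) -> float:
    """Fraction of panel targets at which percent inhibition reaches the cut-off."""
    if not inhibition_pct:
        raise ValueError("empty panel profile")
    hits = sum(1 for value in inhibition_pct.values() if value >= cutoff_pct)
    return hits / len(inhibition_pct)


def active_targets(inhibition_pct: Dict[str, float], cutoff_pct: float = 50.0) -> List[str]:
    """Alphabetically sorted names of the targets that count as hits."""
    return sorted(t for t, value in inhibition_pct.items() if value >= cutoff_pct)


if __name__ == "__main__":
    # Hypothetical percent-inhibition values at one concentration (illustrative only).
    profile = {"hERG": 72.0, "5-HT2B": 55.0, "M1": 12.0, "COX1": 3.0}
    print(hit_rate(profile))        # fraction of the four targets that are hit
    print(active_targets(profile))  # which targets those are
```

How a hit rate is binned into selective, moderately promiscuous and promiscuous classes depends on thresholds that are not specified here, so the sketch deliberately stops at the raw fraction.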

The data produced over a period of time in dedicated panels are an extremely rich source of information for data mining experiments related to predictive safety. Using a set of safety pharmacology profiling data published previously, it was reported that the percentage of compounds that displayed promiscuous properties during the lead optimization stage ranged from 20 to 30% (when using a cut-off of 50% inhibition at 10 µM concentration) [23]. By mining those data, empirical rules were established that discriminated between promiscuous and selective compounds using their structural features. For instance, both calculated n-octanol/water partition coefficient (AlogP) and molecular weight (MW) were significantly higher for promiscuous compounds compared with selective ones [24]. The number of nitrogen atoms was found to be higher for promiscuous compounds while the number of oxygen atoms was lower compared with selective compounds. By contrast, the number of H-bond donor or acceptor atoms was not significantly different between the two groups of compounds. Some structural motifs such as indole, furan and piperazine rings were overrepresented in the promiscuous set. Interestingly, compounds with a carboxylic acid group showed high selectivity, probably due to their negative charge that can prevent unfavorable interaction with multiple targets. (Other acidic groups such as tetrazole or sulfonamide did not show such a large difference.) Of the 585 compounds in the data set with a carboxylic acid, 79% were selective, 19% moderately promiscuous and only 2% promiscuous (in the MDL Drug Data Report, 20% of drugs contain a carboxylic acid moiety) [60].

Similar approaches were published recently to relate the origin of promiscuity to chemical structures and calculated descriptors. Peters et al. analyzed 213 compounds profiled in 80 screening panels [61]. They observed an increase in promiscuity above a ClogP value of 2. For basic compounds, the promiscuity seems to increase gradually with the calculated pKa. They also found that the presence of a positive charge can be a substantial factor for the increase in promiscuity. In another recent study [62], it was shown that promiscuity is correlated to the size of the molecule based on the calculated molecular frame, where the molecular frame is defined as the fraction of the size of the molecular framework versus the size of the whole molecule.

The promiscuity of a compound can be stoichiometric, where ligand and protein are typically present in a 1:1 relation, or non-stoichiometric. Compounds with many hydroxyl groups, such as flavonoids (e.g., quercetin), bind unspecifically to proteins in a non-stoichiometric way and are promiscuous because of their inherent physicochemical properties rather than their biological action. Such compounds can also influence assay technologies and result in false-positive readouts.

2.3 Mining of adverse drug reactions

While the classifications of promiscuity are largely based on structural features related to selected physicochemical properties, mitigation of ADRs associated with particular off-targets is largely structure-activity relationship (SAR) based and ideally performed in parallel with the optimization of the primary target. Computational approaches can help to uncover associations of adverse effects with chemical structure and bioactivity. In order to make ADRs amenable to computational methods, a sufficient number of observations of drug-ADR pairs has to be available. A crucial step in obtaining such a data set is the utilization of a standardized terminology for ADRs. The Medical Dictionary for Regulatory Activities (MedDRA) is an ontology that was assembled beginning in the 1990s to address this need. It covers different medical dictionaries including the World Health Organisation’s (WHO) adverse reaction terminology, the Coding Symbols for a Thesaurus of Adverse Reaction Terms and the International Classification of Diseases (ICD). MedDRA terms are organized in a hierarchical ontology comprising five levels, which includes extensive synonym mapping [63]. Normalized terms are used to report ADRs in clinical trials or pharmacovigilance systems such as the Adverse Event Reporting System (AERS) of the US Food and Drug Administration (FDA) [64]. A database containing international ADR data is VigiBase (maintained by the WHO) [65]. These databases store reports from patients and health-care providers about adverse effects of medicines together with demographic, geographic and treatment data. Recently, a computer-readable and publicly available SIDe Effect Resource (SIDER) has been introduced, which seeks to make drug label and post-marketing ADR data readily available [66]. Such information sources can be used to associate drug targets and physicochemical properties with ADRs.
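The role of a standardized terminology in assembling drug-ADR pairs can be illustrated with a short Python sketch. It maps free-text reported terms to a preferred term through a synonym table and then counts drug-ADR pairs. Everything in it is an assumption made for illustration: the two small dictionaries are a hypothetical toy vocabulary with only two levels, not an excerpt of MedDRA or its five-level hierarchy, and the drug names and reports are invented.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Hypothetical toy vocabulary (NOT real MedDRA content): reported term -> preferred term.
SYNONYM_TO_PREFERRED = {
    "dry mouth": "dry mouth",
    "xerostomia": "dry mouth",
    "mouth dryness": "dry mouth",
    "diarrhoea": "diarrhoea",
    "diarrhea": "diarrhoea",
    "loose stools": "diarrhoea",
}

# Hypothetical roll-up of preferred terms to a broader organ class.
PREFERRED_TO_ORGAN_CLASS = {
    "dry mouth": "gastrointestinal disorders",
    "diarrhoea": "gastrointestinal disorders",
}


def normalize(term: str) -> Optional[str]:
    """Map a reported term to its preferred term; None if the term is unknown."""
    key = " ".join(term.lower().split())  # lower-case and collapse whitespace
    return SYNONYM_TO_PREFERRED.get(key)


def count_drug_adr_pairs(reports: Iterable[Tuple[str, str]]) -> Counter:
    """Count (drug, preferred term) pairs, skipping terms that cannot be mapped."""
    counts: Counter = Counter()
    for drug, term in reports:
        preferred = normalize(term)
        if preferred is not None:
            counts[(drug.lower(), preferred)] += 1
    return counts


if __name__ == "__main__":
    # Invented reports for two hypothetical drugs.
    reports = [("DrugA", "Xerostomia"), ("druga", "dry  mouth"),
               ("DrugB", "Diarrhea"), ("DrugB", "unmapped term")]
    for (drug, adr), n in count_drug_adr_pairs(reports).items():
        print(drug, adr, PREFERRED_TO_ORGAN_CLASS[adr], n)
```

Without the synonym mapping, "Xerostomia" and "dry mouth" would be counted as different events for the same drug, which is exactly the fragmentation a controlled terminology is meant to prevent.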


Generally, if the mechanism of an adverse event can be determined, it falls into one of two broad categories: those that are mediated by primary (i.e., intended) targets and those mediated by secondary (i.e., unintended or off) targets. In the case of a target that has several closely related proteins (for example, isoforms), the latter can in principle be overcome by increasing compound selectivity for the primary target; however, ADRs that are directly associated with an intended target are inherently problematic. Knowledge of the tissue-specific expression of targets may help to mitigate potential risk by optimizing the compound for specific pharmacokinetic properties. For example, drugs that cannot penetrate the blood-brain barrier, such as second-generation antihistamines, are much less likely to cause effects in the CNS [67].

There exists a considerable amount of mechanistic understanding that links specific off-target proteins to particular ADRs. A prominent example is the case of arrhythmias caused by blocking of the hERG potassium channel [15]. Other specific off-target-mediated ADRs have been suggested and efforts have been made to collect such evidence from different data sources [68-71]. The Drug Adverse Reaction Target Database assembles information about ADR-related targets from the literature using automated text mining [68]. The Drug-Induced Toxicity Related Proteins database provides similar information, but is manually curated [70]. Combination of kinase inhibition profiles and adverse reaction data together with literature text mining approaches enabled individual kinases to be associated with particular adverse events, for example, EGFR inhibition and diarrhea [72]. Furthermore, gene expression profiles have been used to link ADRs to biological processes, rather than individual targets [73].

Individual adverse effects that constitute the most serious threats to patients have led to models for particular ADRs.
For example, predictive models for cardiac side effects (especially hERG binding) have been suggested that utilize quantitative structure-activity relationship (QSAR) [74-76] and other computational approaches, for example, support vector machines [77]. Other models focus on different tissues or organs, such as liver and kidney [78-81]. One limitation of QSAR models is their restricted applicability to novel chemical space not covered by the training set. While SAR determinants related to individual molecular targets can often be adequately modeled, entire tissues or organs may reflect multiple underlying mechanisms with quite distinct SAR landscapes. This makes it difficult to represent structure-toxicity relationships using chemical descriptors.

ADRs have also been used as experimental endpoints to describe what are essentially human phenotypes [82,83]. In these approaches, individual drugs are represented by profiles of the ADRs they are related to. These profiles can then be compared with each other in order to find drugs that exhibit similar phenotypic effects. ADR profile comparison thus represents an orthogonal approach to ligand-based molecular similarity. It has been shown that drugs with both ADR profile and chemical similarity are likely to have targets in common. Taking structural and ADR similarity into account, proteins targeted by the same drugs could be identified [82]. Comparison of ADR profiles with target-binding profiles allowed the identification of target signatures predictive of ADRs [83,84].

Relationships between in vitro binding profiles and adverse events have been established on the basis of clustering [83] and Bayesian models [84]. In the clustering approach, drugs have been compared using their ADR or target-binding profiles, respectively. Overlapping clusters were then extracted that linked specific target activities to a subset of the observed ADRs [83]. The Bayesian approach [84] compares models for target activities and ADRs that have both been trained using common chemical descriptors. Model correlation then quantified the association between individual targets and ADRs, for example, COX1 was associated with gastrointestinal side effects and muscarinic receptors with oral dryness [84]. Clustering approaches have been applied to both ADR and target-binding profiles. Clusters with characteristic, aggregated ADR profiles could then be aligned with target-binding profiles; furthermore, those could be attributed to chemical compound classes, showing that target profiles can be predictive of ADRs.

A systematic comparison in chemical space of ADRs of marketed drugs led to the identification of structural motifs associated with particular ADRs [85]. ADRs could then be clustered based on common characteristic chemical features, and emerging clusters were largely found to represent system organ classes from MedDRA corresponding to adverse effects, for example, psychiatric ADRs clustered together, as did cardiac ADRs [85]. An important limitation of human ADRs for model building is that they are normally not specific enough to represent the underlying molecular mechanism(s) of toxicity.
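The comparison of ADR profiles between drugs described above can be sketched in a few lines of Python. The example treats each drug's ADR profile as a set of normalized terms and scores pairs of drugs with the Jaccard (Tanimoto) coefficient. The choice of this similarity measure, the threshold, the function names and the three invented profiles are all assumptions made for illustration; the cited studies may use different profile representations and metrics.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard (Tanimoto) coefficient of two sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(profiles: Dict[str, Set[str]],
                  threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """All drug pairs whose ADR profiles reach the similarity threshold."""
    pairs = []
    for d1, d2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[d1], profiles[d2])
        if score >= threshold:
            pairs.append((d1, d2, score))
    return pairs


if __name__ == "__main__":
    # Invented ADR profiles for three hypothetical drugs.
    profiles = {
        "drug_a": {"nausea", "dry mouth", "dizziness"},
        "drug_b": {"dry mouth", "dizziness"},
        "drug_c": {"rash"},
    }
    for d1, d2, score in similar_pairs(profiles):
        print(d1, d2, round(score, 2))
```

The same function applied to sets of structural features instead of ADR terms would give a chemical similarity, so the two views can be combined, for instance by requiring both scores to exceed a threshold before proposing a shared target.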
However, the successful alignment of chemical features [82,85], in vitro target-binding profiles [83,84] and gene expression data [73] with adverse reactions has made it possible to express complex adverse events in terms of biological pathways and targets [86-89]. While public ADR data are of interest, they suffer from a number of issues: overreporting driven by class action lawsuits or by public perception of a particular drug’s safety, underreporting of adverse reactions considered to be obvious (e.g., alopecia during chemotherapy), and entries made by members of the general public who are unaware of possible drug-drug or drug-food interactions. In summary, models can be developed to predict, from chemical structure alone, the most likely ADRs that a molecule could cause. Conversely, the set of ADRs that a molecule causes can be regarded as the readout of a sophisticated phenotypic assay directly related to the human organism, and models can also be built using these readouts.

2.4

Preclinical animal safety data integration for predictive safety

As part of the drug development pipeline, nonclinical pharmacology and toxicology data must be produced in
animal models [90]. These data are mandatory for submission to regulatory agencies and are highly regulated in terms of experimental design and quality (good laboratory practices are enforced [91]). Animal models for predictive safety have been used for decades and the pharmaceutical industry as a whole has accumulated a massive amount of historical drug safety data in various animal species. According to our archives, Novartis and its parent companies conducted preclinical animal studies on more than 14,000 compounds and compound combinations over the past 50 years. Animal safety data represent the single largest, and probably least utilized, data set for new predictive toxicology models. Compared with human ADR data, the preclinical animal data are typically collected by experts, better controlled and more complete. For each animal in a study, exhaustive analysis is performed, which includes, among others, observational clinical analysis, the collection of many biomarkers, hematology, microscopic and macroscopic analysis of all tissues, gene expression, pharmacokinetic data and protein localization [47,92-94]. Furthermore, these data are gathered at several dose levels and time points and on many more compounds than make it into clinical studies; therefore, they cover a much larger chemical space. The ability to integrate and mine these data in order to learn from past compounds would be a major breakthrough for predictive drug safety. To this end, the European Innovative Medicines Initiative (IMI) eTOX (expert systems in Toxicology) consortium is bringing together 13 member companies of the European Federation of the Pharmaceutical Industries and Associations with academic and biotech partners to share and exploit such data to create new models for drug safety prediction. 
At Novartis, we are developing specific tools for data integration and exploration that help in our predictive safety efforts, for example, a safety data warehouse storing our raw preclinical data, ontologies for pathology and anatomy, search engines and visualization tools for cross-study data analysis. Some of these tools will be made available to our partners in the IMI eTOX consortium and also to the wider drug safety community. It is our hope that this sharing of large amounts of animal data on specific toxicological endpoints can help move the science of predictive toxicology forward and also help with the adoption of data standards in this area. The creation of data standards is also driven by global projects such as the CDISC-SEND [95] submission of preclinical data to the FDA, which is to be implemented in 2011, as well as the Registration, Evaluation, Authorization and Restriction of Chemicals (REACH) initiative in Europe [96]. Standard annotation is the first step toward the creation of predictive models, and many other groups are also developing ontologies related to in silico and in vitro toxicology, such as CBES-DD [97], EPA Gene-Tox, the EPAA and ToxCast. A listing of existing databases is available elsewhere [98]. Until recently, apart from the field of toxicogenomics, the creation of tools for the specific analysis of data generated
by the preclinical safety (PCS) sciences has largely been ignored by bioinformaticians. Now there are a number of efforts underway across the globe to address this exciting and complex area of research [97,99-103]. More than a decade ago, the PCS department at Novartis started a large data-warehousing effort to store raw histopathology, toxicology and clinical pathology data from all internal animal studies. This effort has led to the creation of our current data integration system, called Inspire (INtegrated Safety and Preclinical Information REpository). The system contains data from more than 50 different structured data sources, including the raw data for more than 3500 preclinical studies. Inspire also contains safety data from more than 1000 Novartis clinical studies, making it an extremely rich translational data mining resource. It further contains information on targets, compounds, mechanisms of action, drug pharmacology and approved drugs on the market, thus enabling data mining of a large body of drug safety data and other relevant information. The ability to mine integrated preclinical animal safety data across studies using standardized terminologies (such as CDISC-SEND) will allow us to ask questions of the data that many institutions cannot answer today [95]. A typical use case might be: ‘Provide the chemical structure codes of all structures observed to cause hypertrophy in any tissue in treated but not control animals with increasing severity at increasing dose.’ Mining the data in this way will first allow us to identify, rapidly and efficiently, events that occur in studies performed by different people or organizations in different geographic locations. Once we can answer such questions, we can begin to imagine using these data for meaningful QSAR modeling based on PCS data.
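The hypertrophy use case quoted above can be made concrete with a small sketch. It does not reflect the Inspire schema, which is not described here: the flattened record layout, the compound codes and the function `dose_dependent_hypertrophy` are hypothetical, and "increasing severity at increasing dose" is interpreted, as an assumption, as a strictly increasing maximum grade across at least two dose levels.

```python
from collections import defaultdict

# Hypothetical flattened findings, one row per observation:
# (compound code, dose in mg/kg, tissue, finding, severity grade).
# Dose 0 marks control animals; only observed findings are recorded.
FINDINGS = [
    ("CPD-A", 10, "liver", "hypertrophy", 1),
    ("CPD-A", 30, "liver", "hypertrophy", 2),
    ("CPD-A", 100, "liver", "hypertrophy", 3),
    ("CPD-B", 0, "kidney", "hypertrophy", 1),   # also present in controls
    ("CPD-B", 50, "kidney", "hypertrophy", 2),
    ("CPD-C", 10, "liver", "hypertrophy", 2),   # no increase with dose
    ("CPD-C", 100, "liver", "hypertrophy", 2),
    ("CPD-C", 100, "liver", "necrosis", 3),     # different finding, ignored
]


def dose_dependent_hypertrophy(findings):
    """Compounds with hypertrophy in treated animals only, worsening with dose."""
    # (compound, tissue) -> {dose: highest hypertrophy grade seen at that dose}
    worst = defaultdict(dict)
    for compound, dose, tissue, finding, grade in findings:
        if finding == "hypertrophy":
            by_dose = worst[(compound, tissue)]
            by_dose[dose] = max(by_dose.get(dose, 0), grade)

    hits = set()
    for (compound, _tissue), by_dose in worst.items():
        if 0 in by_dose:
            continue  # seen in control animals, so not attributed to treatment
        grades = [by_dose[dose] for dose in sorted(by_dose)]
        # Require at least two dose levels with strictly increasing severity.
        if len(grades) >= 2 and all(a < b for a, b in zip(grades, grades[1:])):
            hits.add(compound)
    return sorted(hits)


print(dose_dependent_hypertrophy(FINDINGS))
```

In this toy data set only CPD-A qualifies. The query is simple once findings, tissues and doses share one vocabulary across studies, which is why standardized terminologies are a precondition for this kind of mining.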
One of the basic tasks that such a system must accomplish is to give a quick overview of available data through a simple search tool that is aware of all relevant data types and all synonyms for these elements. Once relevant data are found, one of the challenges is to allow exploration and visualization of large, complex data sets in a simple way. This is achieved by providing drill-down capabilities, for example, to go easily from a compound to an individual animal in a preclinical study. Figure 3, for instance, shows the microscopic anatomic pathology observations of an entire study in a simple heat map view [104,105]. Clicking on one of the animals in the heat map leads to a page containing all related data, such as in-life observations, anatomic pathology observations, clinical pathology, drug exposure, microarray experiments performed and samples in fridges and archives. An example can be seen in Figure 4, which shows one of the key images from a typical animal page: the overview of events for that specific animal across the study timeline. In conjunction with other graphs showing clinical pathology, drug exposure and organ weight data for the animal, this graph helps to understand the animal findings in the context of all other relevant data.
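The heat map view of a study amounts to pivoting individual observations into an animal-by-finding grid of severity grades. The following is a minimal sketch of that pivot under stated assumptions: the animal identifiers, finding labels, grades and the text symbols used for rendering are all hypothetical, and the actual Inspire implementation is not shown in this article.

```python
# Hypothetical microscopic observations: (animal ID, finding, severity grade 1-4).
OBSERVATIONS = [
    ("A01", "liver: hypertrophy", 2),
    ("A01", "kidney: basophilia", 1),
    ("A02", "liver: hypertrophy", 3),
    ("A03", "kidney: basophilia", 1),
]


def severity_grid(observations):
    """Pivot observation rows into an animal-by-finding grid (0 = not observed)."""
    animals = sorted({animal for animal, _, _ in observations})
    findings = sorted({finding for _, finding, _ in observations})
    grid = {animal: dict.fromkeys(findings, 0) for animal in animals}
    for animal, finding, grade in observations:
        # Keep the highest grade if a finding is recorded more than once.
        grid[animal][finding] = max(grid[animal][finding], grade)
    return animals, findings, grid


def render(animals, findings, grid, symbols=".+*#@"):
    """Draw one text row per animal; denser symbols stand for higher grades."""
    return [
        animal + " " + "".join(symbols[grid[animal][f]] for f in findings)
        for animal in animals
    ]


animals, findings, grid = severity_grid(OBSERVATIONS)
for line in render(animals, findings, grid):
    print(line)
```

Animals without any recorded finding do not appear in this sketch; a real view would take the list of animals from the study design so that unaffected animals are shown as empty rows.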

Data retrieval, and especially integration, is challenging when dealing with nonstandardized data created by different people over a long period of time using multiple systems in a global organization. Two main efforts are needed to make this possible. The first is to establish standardized identifiers for each level of the data stack defining a given sample: project code, compound code (including salts and batches), study identifier, animal identifier, organs and tissues, dates, sample barcodes, sample storage location and availability. This falls within the scope of data-warehousing activities and of quality assurance and quality control (QA/QC), and can usually be resolved by integrating all the different sources, together with some manual curation. The second is the creation of ontologies to define and aggregate data collected from different studies. The use of ontologies is now common in biology and has led to the creation of tools and repositories such as the Open Biomedical Ontology project [106]. Ontologies are needed for anatomy (organs, tissues, cell types), histopathology findings, clinical pathology measurements, in-life observations, reproductive toxicology observations, study design and other areas. As an example of the complexity of this vocabulary, we have determined that our database at Novartis contains 11,000 histopathology terms, at least 10 times more than are actually needed. These terms arise from animal-specific descriptions, duplication of terms for the same finding across tissues (e.g., liver inflammation and kidney inflammation), alternative spellings, over-specification and the use of synonyms. While ontologies for organs, tissues and cell types are relatively well established, public standard ontologies for preclinical toxicology are virtually nonexistent.
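Two of the sources of term inflation named above, duplication of a finding across tissues and alternative spellings, can be illustrated with a minimal normalization sketch. The vocabularies `TISSUES` and `CANONICAL`, the raw terms and the function `normalize` are hypothetical and far smaller than any real ontology; the sketch only shows how separating the tissue from the finding collapses several raw terms into fewer canonical ones.

```python
import re

# Hypothetical controlled vocabularies; real ontologies would be far larger.
TISSUES = {"liver", "kidney", "heart"}
CANONICAL = {
    "inflamation": "inflammation",   # misspelling
    "haemorrhage": "hemorrhage",     # alternative spelling
}


def normalize(raw_term):
    """Split a free-text term into (tissue, canonical finding)."""
    # Lowercase and replace punctuation with spaces before tokenizing.
    words = re.sub(r"[^a-z ]", " ", raw_term.lower()).split()
    tissue = next((w for w in words if w in TISSUES), None)
    finding = " ".join(w for w in words if w not in TISSUES)
    return tissue, CANONICAL.get(finding, finding)


RAW_TERMS = [
    "Liver inflammation",
    "Kidney, inflamation",
    "Haemorrhage",
    "heart hemorrhage",
]
normalized = [normalize(term) for term in RAW_TERMS]
distinct_findings = {finding for _, finding in normalized}
print(normalized)
print(distinct_findings)
```

Here four raw terms reduce to two findings, each qualified by a tissue where one was given. Over-specification and true synonyms need curated mappings and expert review, which simple string handling of this kind cannot replace.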
This lack of a standard ontology for toxicology findings is especially striking when compared with the ontologies available for clinical trial findings (MedDRA [63]) or for health-care billing and epidemiology (SNOMED CT [107], ICD-9 [108]). Several initiatives are under way to create standard ontologies for PCS, such as the IMI eTOX consortium [109,110] and the OpenTox consortium [10,96]. As part of the IMI eTOX initiative, a working group is creating common ontologies for anatomy, cell and tissue types, microscopic and macroscopic pathology, in-life observations and clinical pathology. The first version of the anatomy ontology was completed at the end of 2010, and the microscopic histopathology ontology is scheduled to be completed in 2011. Following review and curation of all the vocabularies from the partner companies, these ontologies will be used as part of the larger goal of integrating the findings from animal studies across all the companies. It is hoped that this initiative will not only help drive new standards in the safety sciences but also lead to new models for specific toxicology endpoints. It will also allow scientists to make better decisions as to which compounds should move forward into clinical studies, thereby helping to avoid costly late-stage failures and, ultimately, to deliver safe drugs to patients.

Figure 3. Inspire screen shot showing anatomic pathology data for a given study. The image shows animals on one axis and pathology findings on the other. Each intersection contains a dot colored according to the severity of the finding. When a finding is present, the grade is indicated using a color scale. Note that in this example the ontologies have not been applied to the findings.

Figure 4. Inspire screen shot showing in-life observations, clinical pathology sampling and test phases for a given animal. The black bar represents the study duration; vertical bars delimit study phases (pretest, test and recovery period, if any). Diamonds represent daily observations, triangles represent findings and dots represent clinical pathology sampling (hovering over a dot shows the analytes measured, and clicking on it leads to the corresponding data).

3.

Expert opinion

Decisions around predictive safety, or rather predicted potential safety liabilities, are made by various people at different times during the discovery and development cycle of new drugs. It is, therefore, imperative to integrate all data generated around a particular area of chemistry, biology and toxicology and to facilitate their dissemination. Such an approach will allow many decisions to be made earlier in the process. In the following, we outline several factors that will help to achieve this goal. With today’s high-throughput capabilities in many areas of experimental science, the pace of data generation has surpassed the rate at which these data can be analyzed. To add further to the data deluge, many institutions have a wealth of historical data that are not semantically accessible in the same way as newly generated data. Consolidating data from the various isolated ‘data silos’ into a meaningful, comprehensive repository can be of great benefit in putting new observations into appropriate context. The obstacles to achieving this, however, are numerous and hard to tackle on a large scale. More often than not, metadata of historical experiments are stored, if available at all, in a format that makes automatic processing very difficult. Even though text-mining approaches can provide some help, they can often be used only to guide further manual curation. Without strict criteria for the capture of experimental metadata, the problems of data incompatibility are unlikely to get closer to a solution. Current projects to create appropriate ontologies are, therefore, of extraordinary value. A major factor in advancing PSA using computational methods will be the sharing of relevant data to mine. The sharing of toxicological data, however, is a controversial matter because of the significant implications of toxicological observations.
Any advances in this area will greatly impact the field of PSA. The IMI eTOX project is, therefore, anticipated to be extremely valuable for the provision of well-structured data for the development of new predictive methods. There is a separation between preclinical and clinical sciences that hinders advances in PSA. A tighter integration of clinical observations with data generated earlier will provide useful information to guide future projects. Furthermore, the availability of relatively cheap methods to characterize clinical samples on a system level, using, for example, untargeted metabolomics and deep sequencing, can put clinical observations into a rich context. Such approaches will, obviously, generate even more data that require analysis. However, with the advent of personalized medicine, such methods are set to increase in importance. PSA can also be thought of as a guiding principle for early discovery. As it stands, no hypothesis is needed, for example, to sequence a genome or to assay a corporate screening deck in a high-throughput screening (HTS) campaign. This is unfortunate because the actual goal of experimental science is falsification and not verification [111]. The question at hand is, therefore, whether we can find a way to use the massive amount of data to generate a hypothesis after all. More specifically, do we think broadly enough, and are we leveraging existing knowledge to make effective decisions when assessing hits from an HTS campaign? Most of the current literature treats post-HTS activities as a ‘hit triaging’ process instead of ‘hit assessment.’ It is the opinion of the authors that the objective is not to exclude hits but to provide sufficient annotation to build a hypothesis around the hits [112].
This statement is valid for ‘black box’ screens, where the readout is a certain phenotype and the mechanism of action is unknown, but it holds true for biochemical screens as well, because HTS is a reductionist approach to a specific biological question. For
complex phenotypes, such as cell death, it might also be desirable to further deconvolute the biological processes involved. We must look not at one level, but at the interaction of processes across various levels: from a molecular and cellular level to that of organs and organisms [113]. There is a gap between the thousands of confirmed hits and the few compounds that a team can select for low-throughput assays such as pull-down experiments. To bridge this gap, methods and workflows are available today to better frame computational biology and high-throughput screening with a hypothesis [89,114,115]. More specifically, large curated data sources of protein-protein interactions and compound-protein interactions are readily available today. However, this knowledge is rarely used to create focused sets to test compound-target-phenotype hypotheses. The integration of legacy compound-target information with the relevant biological context provides a fertile ground for the streamlining of HTS and systems biology in the assessment of the hits from a ‘black box’ screen. Some of these lessons are already applied in library design where there is an increasing amount of thought given to the biological relevance of screening collections [116,117]. In summary, we predict that advances in the integration of bioactivity, biological network and safety data across all biological scales involved (starting from individual molecular entities, over features of tissues, to clinical outcomes in patients) will both drive and benefit the development of modeling and simulation techniques, as well as prompt the conception of innovative visualization tools that allow effective interrogation of that data by specialists and nonspecialists alike.

Acknowledgements

F Nigsch, E Lounkine, and P McCarren are presidential postdoctoral fellows of the Education Office of the Novartis Institutes for BioMedical Research.

Declaration of interest

All of the authors are employees of Novartis AG.

Bibliography

Papers of special note have been highlighted as either of interest (•) or of considerable interest (••) to readers.

1. CDC. Health, United States, 2009: With Special Feature on Medical Technology. National Center for Health Statistics, Hyattsville, MD; 2010

2. Howard R, Avery A, Slavenburg S, et al. Which drugs cause preventable admissions to hospital? A systematic review. Br J Clin Pharmacol 2007;63:136-47

3. Lazarou J, Pomeranz B, Corey P. Incidence of adverse drug reactions in hospitalized patients: a meta-analysis of prospective studies. JAMA 1998;279:1200-5

4. The World Bank. World Development Indicators. 2008. Available from: http://data.worldbank.org/data-catalog/world-development-indicators [Last accessed 4 October 2011]

5. Meyer U. Pharmacogenetics and adverse drug reactions. Lancet 2000;356:1667-71

6. Kola I, Landis J. Can the pharmaceutical industry reduce attrition rates? Nat Rev Drug Discov 2004;3:711-15

7. DiMasi J, Hansen R, Grabowski H. The price of innovation: new estimates of drug development costs. J Health Econ 2003;22:151-85

8. Marchant C, Briggs K, Long A. In silico tools for sharing data and knowledge on toxicity and metabolism: Derek for Windows, Meteor, and Vitic. Toxicol Mech Methods 2008;18:177-87

9. Sanderson D, Earnshaw C. Computer prediction of possible toxic action from chemical structure; the DEREK system. Hum Exp Toxicol 1991;10:261-73

10. Hardy B, Douglas N, Helma C, et al. Collaborative development of predictive toxicology applications. J Cheminform 2010;2:7
• Describes ongoing efforts for collaborations in toxicology.

11. Benigni R, Bossa C, Tcheremenskaia O, et al. Alternatives to the carcinogenicity bioassay: in silico methods, and the in vitro and in vivo mutagenicity assays. Expert Opin Drug Metab Toxicol 2010;6:809-19

12. Curran ME, Splawski I, Timothy KW, et al. A molecular basis for cardiac arrhythmia: HERG mutations cause long QT syndrome. Cell 1995;80:795-803

13. Sanguinetti MC, Jiang C, Curran ME, et al. A mechanistic link between an inherited and an acquired cardiac arrhythmia: HERG encodes the IKr potassium channel. Cell 1995;81:299-307

14. Redfern WS, Carlsson L, Davis AS, et al. Relationships between preclinical cardiac electrophysiology, clinical QT interval prolongation and torsade de pointes for a broad range of drugs: evidence for a provisional safety margin in drug development. Cardiovasc Res 2003;58:32-45

15. Sanguinetti M, Tristani-Firouzi M. hERG potassium channels and cardiac arrhythmia. Nature 2006;440:463-9

16. Finlayson K, Turnbull L, January CT, et al. [3H]Dofetilide binding to HERG transfected membranes: a potential high throughput preclinical screen. Eur J Pharmacol 2001;430:147-8

17. Ames B, McCann J, Yamasaki E. Methods for detecting carcinogens and mutagens with the Salmonella/mammalian-microsome mutagenicity test. Mutat Res 1976;31:347-64

18. Zeiger E, Mortelmans K. The Salmonella (Ames) Test for Mutagenicity. In: Current Protocols in Toxicology. John Wiley & Sons, Inc.; 2001. Available from: http://dx.doi.org/10.1002/0471140856.tx0301s00 [Last accessed 4 October 2011]

19. Hartung T. Toxicology for the twenty-first century. Nature 2009;460:208-12
•• Very insightful article on the state of toxicology and how it might change in the twenty-first century.

20. Langley G, Evans T, Holgate ST, et al. Replacing animal experiments: choices, chances and challenges. Bioessays 2007;29:918-26
•• Describes the needs and possibilities for the replacement of animal experiments.

21. Ferdowsian HR, Beck N. Ethical and scientific considerations regarding animal testing and research. PLoS ONE 2011;6:e24059

22. European Medicines Agency. International Conference on Harmonisation. ICH topic M 3 (R2): non-clinical safety studies for the conduct of human clinical trials and marketing authorization for pharmaceuticals (CPMP/ICH/286/95). Available from: http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500002720.pdf [Last accessed 4 October 2011]

23. Hamon J, Azzaoui K, Whitebread S. In vitro safety pharmacology profiling. Eur Pharm Rev 2006;1:60-3

24. Azzaoui K, Hamon J, Faller B, et al. Modeling promiscuity based on in vitro safety pharmacology profiling data. ChemMedChem 2007;2:874-80

25. Valentin J-P, Bialecki R, Ewart L, et al. A framework to assess the translation of safety pharmacology data to humans. J Pharmacol Toxicol Methods 2009;60:152-8
• Article providing comprehensive information on model assessment.

26. Ridings J, Barratt M, Cary R, et al. Computer prediction of possible toxic action from chemical structure: an update on the DEREK system. Toxicology 1996;106:267-79

27. Benigni R, Bossa C. Structure alerts for carcinogenicity, and the Salmonella assay system: a novel insight through the chemical relational databases technology. Mutat Res Rev Mutat Res 2008;659:248-61

28. Benigni R, Netzeva T, Benfenati E, et al. The expanding role of predictive toxicology: an update on the (Q)SAR models for mutagens and carcinogens. J Environ Sci Health C Environ Carcinog Ecotoxicol Rev 2007;25:53-97

29. Snyder R, Pearl G, Mandakas G, et al. Assessment of the sensitivity of the computational programs DEREK, TOPKAT, and MCASE in the prediction of the genotoxicity of pharmaceutical molecules. Environ Mol Mutagen 2004;43:143-58

30. Naven R, Louise-May S, Greene N. The computational prediction of genotoxicity. Expert Opin Drug Metab Toxicol 2010;6:797-807

31. In Silico Mutagenicity DEREK. Available from: http://www.docstoc.com/docs/39733913/In-Silico-MutagenicityDEREK [Last accessed 21 June 2011]

32. Klopman G, Ptchelintsev D. Antifungal triazole alcohols: a comparative analysis of structure-activity, structure-teratogenicity and structure-therapeutic index relationships using the Multiple Computer-Automated Structure Evaluation (Multi-CASE) methodology. J Comput Aided Mol Des 1993;7:349-62

33. Matthews E, Kruhlak N, Cimino M, et al. An analysis of genetic toxicity, reproductive and developmental toxicity, and carcinogenicity data: II. Identification of genotoxicants, reprotoxicants, and carcinogens using in silico methods. Regul Toxicol Pharmacol 2006;44:97-110

34. Dewar M, Zoebisch E, Healy E, et al. The development and use of quantum-mechanical molecular-models. 76. AM1 - a new general-purpose quantum-mechanical molecular-model. J Am Chem Soc 1985;107:3902-9

35. Stewart J. Optimization of parameters for semiempirical methods 1. Method. J Comput Chem 1989;10:209-20

36. Benigni R, Bossa C, Netzeva T, et al. Mechanistic QSAR of aromatic amines: new models for discriminating between homocyclic mutagens and nonmutagens, and validation of models for carcinogens. Environ Mol Mutagen 2007;48:754-71

37. Hayashi M, Nakamura Y, Higashi K, et al. A quantitative structure-activity relationship study of the skin irritation potential of phenols. Toxicol In Vitro 1999;13:915-22

38. Soffers A, Boersma M, Vaes W, et al. Computer-modeling-based QSARs for analyzing experimental data on biotransformation and toxicity. Toxicol In Vitro 2001;15:539-51

39. Veith G, Mekenyan O, Ankley G, et al. A QSAR analysis of substituent effects on the photoinduced acute toxicity of PAHs. Chemosphere 1995;30:2129-42

40. Ringeissen S, Marrot L, Note R, et al. Development of a mechanistic SAR model for the detection of phototoxic chemicals and use in an integrated testing strategy. Toxicol In Vitro 2011;25:324-34

41. Netzeva T, Aptula A, Benfenati E, et al. Description of the electronic structure of organic chemicals using semiempirical and ab initio methods for development of toxicological QSARs. J Chem Inf Model 2005;45:106-14

42. Lee C, Yang W, Parr R. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron-density. Phys Rev B 1988;37:785-9

43. Becke A. Density-functional thermochemistry 3. The role of exact exchange. J Chem Phys 1993;98:5648-52

44. Hillebrecht A, Muster W, Brigo A, et al. Comparative evaluation of in silico systems for Ames test mutagenicity prediction: scope and limitations. Chem Res Toxicol 2011;24:843-54

45. Snyder R. Assessment of atypical DNA intercalating agents in biological and in silico systems. Mutat Res 2007;623:72-82

46. Mahesh V, Ewing D, Hendry L. Assessing activity and toxicity of drugs in silico based on DNA structure. Med Chem Res 2008;17:159-68

47. McBurney R, Hines W, Von Tungeln L, et al. The liver toxicity biomarker study: phase I design and preliminary results. Toxicol Pathol 2009;37:52-64

48. Greene N, Fisk L, Naven R, et al. Developing structure-activity relationships for the prediction of hepatotoxicity. Chem Res Toxicol 2010;23:1215-22
• Showcases the importance of hepatotoxicity in drug development.

49. Langowski J, Long A. Computer systems for the prediction of xenobiotic metabolism. Adv Drug Deliv Rev 2002;54:407-15

50. Vaz R, Zamora I, Li Y, et al. The challenges of in silico contributions to drug metabolism in lead optimization. Expert Opin Drug Metab Toxicol 2010;6:851-61

51. Fonger G, Stroup D, Thomas P, et al. TOXNET: a computerized collection of toxicological and environmental health information. Toxicol Ind Health 2000;16:4-6

Expert Opin. Drug Metab. Toxicol. Downloaded from informahealthcare.com by Novartis Pharma on 01/20/15 For personal use only.

53.

54.

55.

56.

.

Judson R, Richard A, Dix D, et al. ACToR--Aggregated Computational Toxicology Resource. Toxicol Appl Pharmacol 2008;233:7-13 Sushko I, Novotarskyi S, Korner R, et al. Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J Comput Aided Mol Des 2011;25:533-54 Roth B, Sheffler D, Kroeze W. Magic shotguns versus magic bullets: selectively non-selective drugs for mood disorders and schizophrenia. Nat Rev Drug Discov 2004;3:353-9 Fialho A, Das Gupta T, Chakrabarty A. Designing promiscuous drugs? Look at what nature made! Lett Drug Des Discov 2006;4:40-3 Mencher S, Wang L. Promiscuous drugs compared to selective drugs (promiscuity can be a virtue). BMC Clin Pharmacol 2005;5:3 A comparison of promiscuous and selective drugs.

57.

Frantz S. Drug discovery: playing dirty. Nature 2005;437:942-3

58.

Whitebread S, Hamon J, Bojanic D, et al. Keynote review: in vitro safety pharmacology profiling: an essential tool for successful drug development. Drug Discov Today 2005;10:1421-33

59.

Entzeroth M, Chapelain B, Guilbert J, et al. High throughput drug profiling. J Automot Method Manag 2000;22:171-3

60.

61.

62.

MDL Drug Report. Available from: http://accelrys.com/products/databases/ bioactivity/mddr.html [Last accessed 21 June 2011] Peters J, Schnider P, Mattei P, et al. Pharmacological promiscuity: dependence on compound properties and target specificity in a set of recent Roche compounds. ChemMedChem 2009;4:680-6 Yang Y, Chen H, Nilsson I, et al. Investigation of the relationship between topology and selectivity for druglike molecules. J Med Chem 2010;53:7709-14

63.

Brown E, Wood L, Wood S. The medical dictionary for regulatory activities (MedDRA). Drug Saf 1999;20:109-17

64. Adverse Event Reporting System (AERS). FDA AERS. Available from: http://www.fda.gov/Drugs/InformationOnDrugs/ucm135151.htm [Last accessed 4 October 2011]

65. Hammond I, Gibbs T, Seifert H, et al. Database size and power to detect safety signals in pharmacovigilance. Expert Opin Drug Saf 2007;6:713-21

66. Kuhn M, Campillos M, Letunic I, et al. A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol 2010;6:343

67. Lieberman P. Histamine, antihistamines, and the central nervous system. Allergy Asthma Proc 2009;30:482-6

68. Ji Z, Han L, Yap C, et al. Drug adverse reaction target database (DART): proteins related to adverse drug reactions. Drug Saf 2003;26:685-90

69. Matthews E, Frid A. Prediction of drug-related cardiac adverse effects in humans--A: creation of a database of effects and identification of factors affecting their occurrence. Regul Toxicol Pharmacol 2010;56:247-75

70. Zhang J, Huang W, Zeng J, et al. DITOP: drug-induced toxicity related protein database. Bioinformatics 2007;23:1710-12

71. Yang L, Luo H, Chen J, et al. SePreSA: a server for the prediction of populations susceptible to serious adverse drug reactions implementing the methodology of a chemical-protein interactome. Nucleic Acids Res 2009;37:W406-12

72. Yang X, Huang Y, Crowson M, et al. Kinase inhibition-related adverse events predicted from in vitro kinome and clinical trial data. J Biomed Inform 2010;43:376-84

73. Lee S, Lee KH, Song M, et al. Building the process-drug-side effect network to discover the relationship between biological processes and side effects. BMC Bioinformatics 2011;12(Suppl 2):S2

74. Su B, Shen M, Esposito E, et al. In silico binary classification QSAR models based on 4D-fingerprints and MOE descriptors for prediction of hERG blockage. J Chem Inf Model 2010;50:1304-18

75. Frid A, Matthews E. Prediction of drug-related cardiac adverse effects in humans--B: use of QSAR programs for early detection of drug-induced cardiac toxicities. Regul Toxicol Pharmacol 2010;56:276-89

76. Song M, Clark M. Development and evaluation of an in silico model for hERG binding. J Chem Inf Model 2006;46:392-400

77. Doddareddy M, Klaasse E, Shagufta, et al. Prospective validation of a comprehensive in silico hERG model and its applications to commercial compound and drug databases. ChemMedChem 2010;5:716-29

78. Matthews E, Kruhlak N, Benz R, et al. Identification of structure-activity relationships for adverse effects of pharmaceuticals in humans: Part C: use of QSAR and an expert system for the estimation of the mechanism of action of drug-induced hepatobiliary and urinary tract toxicities. Regul Toxicol Pharmacol 2009;54:43-65

79. Ekins S, Williams A, Xu J. A predictive ligand-based Bayesian model for human drug-induced liver injury. Drug Metab Dispos 2010;38:2302-8

80. Cruz-Monteagudo M, Cordeiro M, Borges F. Computational chemistry approach for the early detection of drug-induced idiosyncratic liver toxicity. J Comput Chem 2008;29:533-49

81. Rodgers A, Zhu H, Fourches D, et al. Modeling liver-related adverse effects of drugs using k-nearest neighbor quantitative structure-activity relationship method. Chem Res Toxicol 2010;23:724-32

82. Campillos M, Kuhn M, Gavin A, et al. Drug target identification using side-effect similarity. Science 2008;321:263-6

83. Fliri A, Loging W, Thadeio P, et al. Analysis of drug-induced effect patterns to link structure and side effects of medicines. Nat Chem Biol 2005;1:389-97

84. Bender A, Scheiber J, Glick M, et al. Analysis of pharmacology data and the prediction of adverse drug reactions and off-target effects from chemical structure. ChemMedChem 2007;2:861-73


85. Scheiber J, Jenkins J, Sukuru S, et al. Mapping adverse drug reactions in chemical space. J Med Chem 2009;52:3103-7

86. Stones DK. Vincristine overdosage in paediatric patients. Med Pediatr Oncol 1998;30:193

87. Berger S, Iyengar R. Role of systems pharmacology in understanding drug adverse events. Wiley Interdiscip Rev Syst Biol Med 2011;3:129-35
• Introduction to system-level analyses for toxicology.

88. Wallach I, Jaitly N, Lilien R. A structure-based approach for mapping adverse drug reactions to the perturbation of underlying biological pathways. PLoS ONE 2010;5:e12063

89. Scheiber J, Chen B, Milik M, et al. Gaining insight into off-target mediated effects of drug candidates with a comprehensive systems chemical biology analysis. J Chem Inf Model 2009;49:308-17

90. FDA. FDA Regulatory Information. Available from: http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM079234.pdf

91. OECD. OECD Principles of Good Laboratory Practice (as revised in 1997). OECD Environmental Health and Safety Publications; Environment Directorate, OECD, Paris 1998. p. 1

92. Beger RD, Sun J, Schnackenberg LK. Metabolomics approaches for discovering biomarkers of drug-induced hepatotoxicity and nephrotoxicity. Toxicol Appl Pharmacol 2010;243:154-66

93. Patterson TA, Li M, Hotchkiss CE, et al. Toxicity assessment of pramipexole in juvenile rhesus monkeys. Toxicology 2010;276:164-71

94. De Jong WH, Van Loveren H. Screening of xenobiotics for direct immunotoxicity in an animal study. Methods 2007;41:3-8

95. Clinical Data Interchange Standards Consortium. Standard for Exchange of Nonclinical Data (SEND). Available from: http://www.cdisc.org/send/ [Last accessed 4 October 2011]


96. openTox. openTox. Available from: http://www.opentox.org/ [Last accessed 4 October 2011]

97. Fostel J, Choi D, Zwickl C, et al. Chemical effects in biological systems--data dictionary (CEBS-DD): a compendium of terms for the capture and integration of biological study design description, conventional phenotypes, and ’omics data. Toxicol Sci 2005;88:585-601

98. Valerio L. In silico toxicology for the pharmaceutical sciences. Toxicol Appl Pharmacol 2009;241:356-70
•• Comprehensive review on in silico techniques for toxicology.

99. Vilar S, Harpaz R, Chase HS, et al. Facilitating adverse drug event detection in pharmacovigilance databases using molecular structure similarity: application to rhabdomyolysis. J Am Med Inform Assoc 2011. Available from: http://jamia.bmj.com/content/early/2011/09/21/amiajnl-2011-000417.abstract

100. Audouze K, Grandjean P. Application of Computational Systems Biology to Explore Environmental Toxicity Hazards. Environ Health Perspect 2011. Available from: http://dx.doi.org/10.1289%2Fehp.1103533

101. Davis AP, Murphy CG, Saraceni-Richards CA, et al. Comparative toxicogenomics database: a knowledgebase and discovery tool for chemical--gene--disease networks. Nucleic Acids Res 2009;37:D786-92

102. Hayes KR, Vollrath AL, Zastrow GM, et al. EDGE: a centralized resource for the comparison, analysis, and distribution of toxicogenomic information. Mol Pharmacol 2005;67:1360-8

103. McHale CM, Zhang L, Hubbard AE, et al. Toxicogenomic profiling of chemically exposed humans in risk assessment. Mutat Res Rev Mutat Res 2010;705:172-83

104. Sneath P. The application of computers to taxonomy. J Gen Microbiol 1957;17:201-26

105. Lobenhofer E, Boorman G, Phillips K, et al. Application of visualization tools to the analysis of histopathological data enhances biological insight and interpretation. Toxicol Pathol 2006;34:921-8

106. Smith B, Ashburner M, Rosse C, et al. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol 2007;25:1251-5

107. Cornet R, de Keizer N. Forty years of SNOMED: a literature review. BMC Med Inform Decis Mak 2008;8(Suppl 1):S2

108. Centers for Disease Control and Prevention. International Classification of Diseases (ICD). Available from: http://www.cdc.gov/nchs/icd.htm [Last accessed 4 October 2011]

109. eTox. Expert Systems in Toxicology (eTox) project. Available from: http://www.etoxproject.eu [Last accessed 4 October 2011]

110. Christensen F, Eisenreich S, Rasmussen K, et al. European experience in chemicals management: integrating science into policy. Environ Sci Technol 2011;45:80-9

111. Glass D. A critique of the hypothesis, and a defense of the question, as a framework for experimentation. Clin Chem 2010;56:1080-5
•• Article in support of curiosity as driver for experiments.

112. Langer T, Hoffmann R, Bryant S, et al. Hit finding: towards “smarter” approaches. Curr Opin Pharmacol 2009;9:589-93

113. Noble D. The music of life: biology beyond the genome. Oxford University Press; Oxford University Press Inc., New York 2006

114. Keiser M, Setola V, Irwin J, et al. Predicting new molecular targets for known drugs. Nature 2009;462:175-81

115. Nidhi, Glick M, Davies J, et al. Prediction of biological targets for compounds using multiple-category Bayesian models trained on chemogenomics databases. J Chem Inf Model 2006;46:1124-33

116. Jacoby E, Schuffenhauer A, Popov M, et al. Key aspects of the Novartis compound collection enhancement project for the compilation of a comprehensive chemogenomics drug discovery screening collection. Curr Top Med Chem 2005;5:397-411

117. Renner S, Popov M, Schuffenhauer A, et al. Recent trends and observations in the design of high-quality screening collections. Future Med Chem 2011;3:751-66


Affiliation

Florian Nigsch†1,6, Eugen Lounkine2, Patrick McCarren3, Ben Cornett1, Meir Glick2, Kamal Azzaoui4, Laszlo Urban2, Philippe Marc5, Arne Müller5, Florian Hahne5, David J Heard5 & Jeremy L Jenkins1

† Author for correspondence

1 Novartis Institutes for BioMedical Research, Inc., Chemical Biology Informatics, Quantitative Biology, Developmental and Molecular Pathways, 220 Massachusetts Avenue, 02139 Cambridge, MA, USA
E-mail: [email protected]

2 Novartis Institutes for BioMedical Research, Inc., Center for Proteomic Chemistry, 250 Massachusetts Avenue, 02139 Cambridge, MA, USA

3 Novartis Institutes for BioMedical Research, Inc., Global Discovery Chemistry, 100 Technology Square, 02139 Cambridge, MA, USA

4 Novartis Institutes for BioMedical Research, Center for Proteomic Chemistry, Molecular Library Informatics, Lead Finding Platform, Novartis Campus Basel, Switzerland

5 Novartis Institutes for BioMedical Research, Preclinical Safety Informatics, Novartis Campus Basel, CH-4056 Basel, Switzerland

6 Novartis Institutes for BioMedical Research, Chemical Biology Informatics, Quantitative Biology, Developmental and Molecular Pathways, Novartis Campus Basel, CH-4056 Basel, Switzerland
