Current Topics in Medicinal Chemistry, 2005, Vol. 5, No. 8, 751-762
Library Design for Fragment Based Screening
Ansgar Schuffenhauer*, Simon Ruedisser, Andreas L. Marzinzik, Wolfgang Jahnke, Marcel Blommers, Paul Selzer and Edgar Jacoby
Novartis Institutes of Biomedical Research, CH-4002 Basel, Switzerland
Abstract: According to Hann's model of molecular complexity, an increased probability of detecting binding to a target protein can be expected when small molecular fragments of low complexity are screened with high sensitivity instead of full-sized ligands with lower sensitivity. Analysis of the HTS summary data of Novartis and comparison with NMR screening results obtained on generic fragment libraries indicate that this expectation holds: hit rates of 0.001% - 0.151% were observed in the identification of ligands with an IC50 threshold in the micromolar range in an HTS setup, and hit rates of 3% or more were observed in NMR screening of fragments with an affinity threshold in the millimolar range. It is, however, necessary to keep in mind that the sets of targets studied were not identical for the two methods and that the experience in NMR screening is too limited for a final conclusion. The term hit rate as used here reflects only the success rate in the observation of a ligand binding event. It must not be confused with the overall success rate of fragment and high throughput screening in the lead finding process, which can be entirely different, since the steps required to follow up a ligand binding event to a lead differ between the two methods. A survey of fragment-based lead discovery case studies in the literature shows that in approximately half of the cases the initial hit fragment was discovered by screening a generic library, whereas in the other cases some knowledge about an initial ligand or the protein binding site was used; systematic virtual screening of fragment databases has been reported only rarely.
As comparatively high hit rates were obtained, further considerations for optimizing the generic fragment screening library were directed to the chemical tractability of the fragments. As several functional groups preferred by chemists for modification and linking of the fragments are also preferentially involved in interactions between the fragments and the target protein, a set of screening fragments was derived from chemical building blocks by masking their linker groups with a chemical transformation which can later be used in the chemical follow-up of the fragment hit. For example, primary amines can be masked as acetamides. If the screening fragment is active, the related building block can then be used for the synthesis of a follow-up library.
*Address correspondence to this author at the Novartis Institutes of Biomedical Research, CH-4002 Basel, Switzerland.
INTRODUCTION
Over the last decade, high throughput screening (HTS) of large molecular libraries in cellular or enzymatic assays has been the dominant technology used for lead discovery. However, the expectation that one will certainly identify a suitable lead, provided the library of compounds screened is large enough, has not materialized in all cases. Examining catalogues of commercially available screening compounds, Ertl [1] discovered that the number of new substituents found increases linearly with the number of structures examined, indicating that the chemistry space is far from being sampled completely. Estimations by Guida et al. [2] and Villar et al. [3] indicate that the chemistry space comprises approximately $10^{60}$ molecules, making its complete sampling with any realistic screening library size impossible. Therefore the size of the screening library could theoretically be increased almost indefinitely without increasing its redundancy, as described by the number of compounds per chemical class. A mathematical model by Harper et al. [4] indicates that such an increase of the screening library size will lead to a linear increase of the average number of hit series found in screening. However, the probability critical for the success of an HTS, namely the probability of finding at least one hit series, does not increase linearly with the library size. Therefore, different lead finding approaches are needed, which also require compound libraries other than those typically used for HTS. One of these methods is fragment-based screening. The rationale for fragment-based screening and its consequences for the design of the libraries used in this approach are discussed in this article.
THE HANN COMPLEXITY MODEL
From a simplified ligand-binding model described by Hann et al. [5], a lead finding approach different from HTS can be derived. Hann decomposed the probability $p_{b,d}$ of detecting a binding event into the probability $p_b$ that the binding event occurs and the probability $p_d$ that, if a binding event occurs, its affinity is high enough for detection (Fig. 1, Equation 1).
$$p_{b,d} = p_b(c, n_{mode}, n_{mismatch}) \cdot p_d(c - 2\,n_{mismatch}) \qquad \text{(Eq 1)}$$
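A minimal Monte Carlo sketch of Equation 1 is given here for illustration. It is not Hann's implementation: it assumes a linear binding site of 12 binary features along which a ligand of c binary features is slid, it counts a binding mode only when the unique best alignment carries exactly the requested number of mismatches, and it uses a logistic curve with an arbitrary midpoint and steepness for p_d. The site length, the alignment scheme, the sigmoid parameters and all function names are assumptions of this sketch.

```python
import math
import random


def mismatches(site, ligand, offset):
    """Count ligand features that do not match the site at this alignment."""
    return sum(1 for i, f in enumerate(ligand) if site[offset + i] != f)


def p_bind(c, n_mismatch, site_len=12, trials=2000, seed=0):
    """Estimate p_b for n_mode = 1: the fraction of random site/ligand pairs
    whose best alignment is unique and has exactly n_mismatch mismatches.
    (Assumed reading of 'exactly one binding mode'; site_len is hypothetical.)"""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        site = [rng.randint(0, 1) for _ in range(site_len)]
        ligand = [rng.randint(0, 1) for _ in range(c)]
        counts = [mismatches(site, ligand, o) for o in range(site_len - c + 1)]
        best = min(counts)
        if best == n_mismatch and counts.count(best) == 1:
            hits += 1
    return hits / trials


def p_detect(effective_matches, midpoint=5.0, steepness=2.0):
    """Assumed logistic form of p_d as a function of c - 2 * n_mismatch."""
    return 1.0 / (1.0 + math.exp(-steepness * (effective_matches - midpoint)))


def p_bind_and_detect(c, n_mismatch, midpoint=5.0, **kwargs):
    """Equation 1: p_bd = p_b(c, n_mode=1, n_mismatch) * p_d(c - 2 * n_mismatch)."""
    return p_bind(c, n_mismatch, **kwargs) * p_detect(c - 2 * n_mismatch, midpoint)


if __name__ == "__main__":
    # Compare a baseline assay with a more sensitive one (lower midpoint).
    for c in range(2, 11):
        baseline = p_bind_and_detect(c, 0)
        sensitive = p_bind_and_detect(c, 0, midpoint=3.0)
        print(f"c={c:2d}  p_bd={baseline:.4f}  p_bd(sensitive)={sensitive:.4f}")
```

Lowering `midpoint` stands in for a more sensitive assay. Under these assumptions the product should peak at a lower complexity and at a higher value, which is the qualitative behaviour described for Fig. (1b).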
In Hann's model, the ligand is described as an object having a variable number $c$ of binary structural features, each of which can either match a structural feature of the binding site and contribute to the binding, or not. The number $c$ serves as a measure of complexity. Based on this model it is possible to determine by numerical simulation the probability $p_b$ that, for a ligand with complexity $c$, there exist exactly $n_{mode}$ binding modes in which the number of feature mismatches is exactly $n_{mismatch}$. Ligands with exactly one binding mode ($n_{mode} = 1$), leading to a clear SAR (Structure Activity Relationship), are preferred, and only these are investigated further. The probability that the binding of a ligand is detected depends on the binding affinity, to which in this simplified model each matching feature contributes equally. A feature mismatch does not contribute positively to the affinity and in addition requires compensation by a matching feature; thus the affinity is proportional to $c - 2\,n_{mismatch}$. Hann uses a sigmoid curve, based on empirical considerations, to describe how the probability that binding can be detected depends on the affinity. Fig. (1a) shows that an increase of complexity leads to an increased probability $p_{b,d}$ of detecting binding events up to a complexity of 5. At this complexity, the probability of detecting the binding of a perfectly matching ligand is maximal. Further increasing the complexity of the ligands decreases the probability of detecting a perfectly matching ligand. However, the probability of detecting a ligand with one mismatch increases, meaning that the overall probability of detecting a binding event may increase slightly. This comes at a price, however, since a ligand that does not match perfectly will still have a low affinity in relation to its molecular weight. Some of its features are not necessary for binding and could be removed without losing activity, but one can only find out which features these are by trial and error. Therefore, according to this model, the ligands with the lowest complexity for which there is a reasonable probability of detection should be screened.
Another reason for screening smaller ligands is the quality of the sampling of the chemical space. Since the number of compounds in the chemical space increases exponentially with the molecular size limit applied, the smaller the size limit is, the more efficiently the corresponding chemical space can be sampled with a given number of compounds [6]. The ligand efficiency, introduced by Kuntz [7] and Hopkins [8] as the free energy of binding ∆G per non-H atom, is an experimentally accessible measure of how perfectly a ligand matches its binding site. A perfectly matching ligand according to the Hann model could have a ligand efficiency coming close to the limit of -6.3 kJ/non-H atom quoted by Kuntz [7].
Fig. (1). Application of the Hann model [5] to fragment based screening. (a) Effect of increasing the ligand complexity while keeping the sensitivity of the assay constant: Increasing the complexity of the ligands screened increases the fraction of ligands matching only partially and decreases the fraction of perfectly matching ligands. (b) Effect of screening with a higher sensitivity: The apparent hit rate increases when ligands with a lower complexity are screened.
© 2005 Bentham Science Publishers Ltd.
The probability $p_d$ reflects the sensitivity of the assay used. Increasing the sensitivity of an assay means reducing the affinity a ligand requires for its binding to be detected, and therefore shifting the curve for $p_d$ leftwards to lower complexity, as illustrated in Fig. (1b). This also shifts the maximum of the probability $p_{b,d}$ of detecting the binding of a perfectly matching ligand towards lower complexity, whereby its maximal value is increased. If one can increase the sensitivity of the assay, one can thus increase the probability of finding a hit binding in a unique way, provided one screens molecules with a lower complexity. Due to the higher hit rate one can accept a lower throughput of the assay, because a smaller number of compounds needs to be screened for the identification of hits. The lower number of structural features participating in the interaction between ligand and binding site will, however, result in weaker binding.
FRAGMENT SCREENING TECHNOLOGIES
Fortunately this scenario is not only hypothetical. In the last decade several screening technologies have been described which have an up to 1000-fold higher sensitivity than the cell-based and cell-free assays used for HTS. These methods allow the weak binding of small molecular fragments to be detected, and their use in lead discovery is often called fragment-based screening (FBS). The differences in the chemical shifts of the NMR signals of a labelled protein, or the changes in the relaxation time of the NMR signals of a ligand, can be used to detect binding to a target protein [9-12]. When crystals of target proteins are soaked in concentrated solutions of small molecules, these molecules may diffuse into the crystal and occupy existing cavities. By X-ray structure analysis, the occupancy of the
cavity can be detected and the binding mode of the molecular fragment may be determined [13, 14]. Mass spectrometry [15, 16] and calorimetry [17, 18] can also be used to detect the binding of small molecular fragments with high sensitivity. Last but not least, classical biochemical screening techniques can be adapted for higher sensitivity by screening more concentrated solutions, since the low sensitivity of such assays is not an inherent property of biochemical screening, but a consequence of the expectation of finding highly active ligands in the micromolar to nanomolar range of activity from the outset, and of the consequent optimization of the assay design for this activity range.
VALIDATING THE PREDICTIONS OF THE HANN MODEL
The existence of a variety of fragment screening methods allows us to evaluate the predictions of the Hann model in practice. We analyzed how well different counts of structural elements, from the number of atoms to the occurrence of more complex structural features encoded in molecular descriptors, were suited to separating sets of biologically active and inactive molecules [19]. The Similog keys [20] emerged as the most efficient structural feature count in this respect, and were used here as the complexity measure for the evaluation of the Hann model. The Similog keys represent pharmacophoric atom triplets, which are characterized by the bond counts of the shortest paths between the three atoms and by the properties of the three atoms. Four atom properties are recognized independently of each other: H-bond donor, H-bond acceptor, lipophilicity (recognized as absence of electronegativity), and bulkiness of substituents. We compared the distributions of Similog key counts, as complexity measure, of the active and inactive molecules among the compounds screened with HTS and NMR screening. In HTS, we used the IC50 summary data of 132 assays, each with more than 1000 IC50 values measured, to define the
set of active molecules with different activity thresholds. As inactive in HTS, we regarded all those compounds which had been in the screening library for at least two years and had in this time frame never shown an activity above the relevant threshold in any primary screen [16]. For the NMR screening, the results obtained by screening the existing fragment libraries described by Jacoby et al. [21] were used as the basis, and we regarded as active any compound which showed activity in NMR screening (T1ρ and WaterLOGSY experiments) and which could be confirmed by a second experiment. Any compound screened by NMR against at least four different targets and never found active was counted as inactive. As shown in Fig. (2a), the actives found in HTS rarely have fewer than 200 Similog triple keys, in contrast to the 50% of the inactive compounds fulfilling this condition. In NMR screening, however, compounds with 13-37 or more Similog triple keys are detected as binders, and 30% of the binders found are in this range of complexity (see Fig. 2b). Only in the lowest complexity range of 0-12 Similog triple keys were few binders found (7%), compared to 28% of all non-binders. This shows that with more sensitive detection methods, less complex molecules can indeed be detected than by classical HTS assays. According to the Hann model, we can expect a higher probability of detecting a target-binding event when screening less complex molecules with higher sensitivity. In 2003, the hit rates for the detection of ligands with an IC50 smaller than ~10 µM when screening the classical Novartis HTS collection of full-sized ligands (~500 K-1000 K compounds, molecular weight mostly in the range of 200 to 600) in an HTS setup were in the range of 0.001% - 0.151%.
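To make the practical meaning of such hit rates concrete, the following minimal Python sketch computes the expected number of hits and the probability of observing at least one hit for a given library size. It assumes that every compound is an independent trial with the same hit probability, a simplification that is not part of the text, and the function names are illustrative. The HTS figures are those quoted above; the 3% hit rate and the library of 1000 fragments correspond to the lower bound and typical size reported in this article for the generic NMR fragment libraries.

```python
import math


def expected_hits(library_size, hit_rate):
    """Expected number of hits when each compound hits with probability hit_rate."""
    return library_size * hit_rate


def prob_at_least_one_hit(library_size, hit_rate):
    """1 - (1 - hit_rate) ** library_size, evaluated in a numerically stable way.
    Assumes independent, identically distributed compounds (a simplification)."""
    return -math.expm1(library_size * math.log1p(-hit_rate))


if __name__ == "__main__":
    scenarios = [
        ("HTS, lowest hit rate", 500_000, 0.00001),   # 0.001% of ~500 K compounds
        ("HTS, highest hit rate", 500_000, 0.00151),  # 0.151% of ~500 K compounds
        ("NMR fragments, 3%", 1_000, 0.03),           # lower bound for fragments
    ]
    for label, size, rate in scenarios:
        print(f"{label:24s} expected hits = {expected_hits(size, rate):8.1f}  "
              f"P(at least one) = {prob_at_least_one_hit(size, rate):.4f}")
```

The comparison concerns binding events only; as the text stresses, it says nothing about the overall success of either method in lead finding.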
In the NMR screening runs on our existing generic fragment libraries (0.8 K-1.2 K fragments of molecular weight 100-300, see also below), hit rates of 3-30% were obtained in the detection of fragments binding to the target protein with an affinity at least in the millimolar range. These data indicate that the prediction of the Hann model is valid. However, it is necessary to bear in mind that the sets of targets to which the two methods have been applied are not identical and that the number of NMR screens run is still limited. The term hit rate as it is used here means only the success rate in observing a binding event, and must not be confused with the overall success rate of both methods in the whole lead finding process, which can be entirely different, since the way in which an observed ligand binding event is followed up to a lead differs between fragment and high-throughput screening.
Fig. (2). Comparison between complexity (expressed as count of unique Similog [20] pharmacophore triple keys) of actives and inactives in HTS and NMR screening. In HTS, ligands with fewer than 200 unique Similog keys are rarely active, whereas there is almost no lower complexity threshold in NMR screening, where only ligands with fewer than 13 Similog keys are mostly inactive.
The possibility that screening the full-sized HTS compound collection with a millimolar affinity threshold would also yield hit rates equal to those obtained in FBS is not denied here, nor would such a possibility contradict the Hann model in the extended version used here. As discussed above, the probability of detecting a binding event does not decrease when molecules exceed the minimum complexity required by the assay affinity. Since the affinity is not expected to increase, owing to the higher frequency of mismatches, the ligand efficiency of the ligands detected by such an experiment would be low, limiting their use in lead discovery. More difficult to answer is the question of whether the smaller fragments match their binding sites more perfectly than the more complex ligands of the HTS collection. The problem arising here is that biophysical screening methods like NMR yield binding constants KD, while most HTS methods give IC50 values, or at best inhibition constants Ki, which are not directly comparable. At Novartis the average ligand efficiency determined from fragment based screening is -2.4 kJ/non-H-atom. This contrasts with an average value of -1.2 kJ/non-H-atom determined on the basis of published KD values [10, 11, 24-30] for NMR screening.
The differences between these values are probably attributable to the selection criteria applied for the KD determination by titration in the NMR, which involves a high effort and is done at Novartis only for very promising NMR screening hits. Analysing again the IC50 summary data for HTS, and assuming that IC50 and KD have approximately the same order of magnitude, a ligand efficiency of -1.2 kJ/non-H-atom is obtained. Thus, it cannot be claimed with certainty that NMR screening produces more efficiently binding ligands, matching their binding site more perfectly, than the more complex HTS ligands. The upper limit for ligand efficiency quoted by Kuntz [7] is far from being reached by ligands discovered by either fragment screening or classical HTS. In this context it is also worth mentioning that the loss of entropy related to the rigid body rotation and translation of a ligand does not depend linearly on its size but can be treated as approximately constant, with a value of 15-20 kJ/mol [31]. This penalty must be compensated by a part of the binding interactions of the ligand, and only the additional binding interactions contribute to the overall ligand efficiency. In smaller ligands a larger relative portion of the binding enthalpy generated by the ligand-protein interactions is required to compensate the rigid body entropy loss, since there are fewer binding interactions overall. Therefore, to achieve the same ligand efficiency, a smaller ligand in fact needs to have more binding interactions per non-H-atom than a larger ligand.
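The ligand efficiency arithmetic discussed above can be sketched in a few lines of Python. The sketch assumes the standard relation ΔG = RT ln KD with a 1 M standard state and a temperature of 298.15 K (no temperature is given in the text), and it treats the rigid-body penalty as a constant 17.5 kJ/mol, the midpoint of the 15-20 kJ/mol range quoted above. The function names and the example fragment (KD of 1 mM, 12 non-H atoms) are illustrative only.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Delta G in kJ/mol from a dissociation constant in mol/L (1 M standard state)."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


def ligand_efficiency(kd_molar, heavy_atoms, temperature_k=298.15):
    """Delta G per non-H atom, in kJ/mol per atom."""
    return binding_free_energy(kd_molar, temperature_k) / heavy_atoms


def required_interaction_energy_per_atom(target_le, heavy_atoms,
                                         rigid_body_penalty=17.5):
    """Interaction free energy per atom needed to reach target_le once the
    (assumed constant) rigid-body penalty in kJ/mol has been paid."""
    return target_le - rigid_body_penalty / heavy_atoms


if __name__ == "__main__":
    # Hypothetical fragment: millimolar binder with 12 non-H atoms.
    print(f"LE = {ligand_efficiency(1e-3, 12):.2f} kJ/mol per non-H atom")
    # Same observed efficiency, different ligand sizes.
    for n in (10, 20, 30):
        needed = required_interaction_energy_per_atom(-1.2, n)
        print(f"{n:2d} non-H atoms: {needed:.2f} kJ/mol per atom before the penalty")
```

Because the penalty is divided by the number of non-H atoms, the per-atom interaction energy needed for a given observed efficiency is more negative for the smaller ligand, which is the point made in the closing sentence of the paragraph above.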
LIBRARY DESIGN: GENERAL PRINCIPLES
Considering the design of fragment-based screening libraries, it is clear from the discussion above that the ligands should be of lower complexity than the molecules typically screened in HTS. Since the throughput of all FBS methods, with the exception of biochemical screening, is rather low, typically only about 500-1000 molecules per target are screened, which requires a careful library design. An analysis of the examples given in [32, 33] with regard to the selection of the initial fragments to be screened reveals that approximately half of the ligands were discovered by screening a generic library, while the other half resulted from the screening of a target-focused library (Fig. 3).
Fig. (3). Methods for discovering the initial active fragment used in the examples for fragment based drug discovery given in [32, 33]. In almost half of the examples a generic fragment library was screened.
The basic principles for the design of fragment screening libraries have been discussed in several publications [21, 34, 35] and have also been applied in the first generation of our screening library. To limit the complexity of the fragments, Congreve [36] introduced, on the basis of Lipinski's rule-of-five [37], the rule-of-three, stating that the molecular weight of screening fragments should be about 300, and the logP