Showing posts with label false postive. Show all posts

Tuesday, January 13, 2009

Self-consistent solvation energy contribution calculation for protein-ligand complexes




Solvation energy is a major contribution to the binding energy of a ligand and is the interaction largely responsible for binding selectivity. Calculating it in practice requires a method that is valid both for small-molecule ligands and for large proteins (and protein-ligand complexes).


Calculating the electrostatic contribution to the binding energy in a continuum solvation approach poses different problems for large and small molecules. The usual choice is some form of the Generalized Born (GB) approximation. It is exact only for a charge at the center of a spherical cavity, and thus can only be valid for a small molecule with most of its charge located on a few atoms.

If the molecule of interest is large, most of the charges are instead close to the molecular surface. The GB approximation in its most commonly accepted form fails next to a molecular surface: the Born radius is off by a factor of 2. This means that no "classic" GB model can work well for both small and large molecules!
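To make the role of the Born radius concrete, here is a minimal sketch of the Born formula for a single centered charge and of the pairwise GB sum with the widely used interpolation function of Still and co-workers. This is not the QUANTUM implementation: the function names, the dielectric constants (1 inside, 78.5 for water) and the choice of units (kJ/mol, Å, elementary charges) are assumptions made for illustration.

```python
import math

# Coulomb constant in kJ*Angstrom/(mol*e^2); the unit choice is an assumption.
COULOMB = 1389.35


def born_energy(charge, radius, eps_in=1.0, eps_out=78.5):
    """Born solvation energy of a charge at the center of a spherical cavity."""
    return -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out) * charge ** 2 / radius


def f_gb(r, born_i, born_j):
    """Still-type interpolation between the Born (r = 0) and Coulomb (large r) limits."""
    rr = born_i * born_j
    return math.sqrt(r * r + rr * math.exp(-r * r / (4.0 * rr)))


def gb_energy(charges, coords, born_radii, eps_in=1.0, eps_out=78.5):
    """Pairwise GB polarization energy; the double sum includes the i == j self terms."""
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    for qi, xi, ri in zip(charges, coords, born_radii):
        for qj, xj, rj in zip(charges, coords, born_radii):
            total += qi * qj / f_gb(math.dist(xi, xj), ri, rj)
    return prefactor * total
```

For a single atom the GB sum reduces to the Born formula, and the self term scales as 1/R. A Born radius that is wrong by a factor of 2 therefore changes that atom's self energy by the same factor, which is why surface charges in a large molecule are so sensitive to it.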

Calculating a binding affinity requires the difference between the energy of the complex (a large molecule) and the energies of the protein (another large molecule) and the ligand (a small molecule) at infinite separation.
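The bookkeeping behind this difference can be sketched in a few lines. The `Energies` class, the split into a vacuum and a solvation term, and all the numbers below are invented for illustration; they are not QUANTUM output.

```python
from dataclasses import dataclass


@dataclass
class Energies:
    """Energy of one species, split into two illustrative components (kJ/mol)."""
    vacuum: float     # gas-phase energy
    solvation: float  # continuum solvation energy

    @property
    def total(self):
        return self.vacuum + self.solvation


def binding_energy(complex_, protein, ligand):
    """E(complex) - E(protein) - E(ligand), the partners being infinitely separated."""
    return complex_.total - protein.total - ligand.total


# Invented numbers: large terms that nearly cancel.
example = binding_energy(
    Energies(vacuum=-1500.0, solvation=-820.0),  # complex
    Energies(vacuum=-1400.0, solvation=-800.0),  # protein
    Energies(vacuum=-60.0, solvation=-50.0),     # ligand
)
```

Because the result is a small difference of large numbers, a solvation model that is biased for only one of the three species (for instance the small ligand) does not cancel out of the binding energy.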

If a GB model is made to work by carefully adjusting the "bare" Born radii to fit the experimental IC50 values of complexes, a good sanity check is to reproduce the experimentally known solvation energies of small molecules and ions (and the other way around). The two graphs in this post show that this is indeed possible. The relatively large error in the small-molecule solvation energies shows that, although the resulting model is reasonable, the obtained GB parameters are only qualitatively transferable between large and small molecules.

Monday, May 26, 2008

Protein Flexibility and False Positives detection.

The standard hit identification procedure with QUANTUM software involves screening a large compound library against a given protein target. An example of such a procedure for a small set of compounds with known activities is discussed in another blog entry.

Let us first show that a calculation with a flexible protein gives a reasonable prediction of the binding free energy. To do that, we selected a set of ~200 protein-ligand complexes from the BindingDB database. The pairs were selected mainly so that the complex is small and the whole calculation is therefore fast. The results are shown in the Figure. The horizontal and vertical axes show the calculated and the experimental values of the binding free energy, respectively; the calculated value is obtained from the position of the ligand in the complex.

The correlation is clearly there, and in a few days I will show that the calculated values are not only accurate but also selective.

The other Figure shows the correlation between the results of the fast rigid-receptor docking procedure (horizontal axis) and the fully flexible binding free energies (vertical axis). Although the rigid-protein force field correlates decently, it fails to recognize electrostatic clashes and thus produces a fairly large number of false positives among the predicted ligands. Only about 10% of the ligands, all originally predicted in the µM range, survive as binders. The trend is also clear: all the binding energy values increase (the fully flexible force field gives fewer binders than the rigid calculation would suggest).
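The survival rate quoted above amounts to a simple count once both sets of energies are available. The sketch below is only an illustration: the function name is hypothetical, and the cutoff of -34 kJ/mol (roughly a 1 µM binder at room temperature) is an assumed threshold, not a value taken from the calculations in this post.

```python
def surviving_fraction(rigid_dg, flexible_dg, cutoff=-34.0):
    """Fraction of ligands predicted as binders by rigid docking that are
    still binders after the flexible recalculation (energies in kJ/mol)."""
    predicted = [i for i, dg in enumerate(rigid_dg) if dg <= cutoff]
    if not predicted:
        return 0.0
    survivors = [i for i in predicted if flexible_dg[i] <= cutoff]
    return len(survivors) / len(predicted)


# Invented energies for four ligands: three pass the rigid screen, one survives.
fraction = surviving_fraction(
    rigid_dg=[-40.0, -36.0, -35.0, -30.0],
    flexible_dg=[-38.0, -20.0, -25.0, -35.0],
)
```

Ligands that pass the rigid screen but fail the flexible recalculation are the false positives of the fast procedure.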

Tuesday, May 15, 2007

QUANTUM free energy vs. statistical scoring functions

QUANTUM employs quantum mechanics, thermodynamics, and an advanced continuum water model for solvation effects to calculate ligand binding affinities. This approach differs dramatically from the scoring functions commonly used for binding affinity prediction. By including the entropy and aqueous electrostatics contributions directly in the calculations, QUANTUM algorithms produce much more accurate and robust values of binding free energies.
The interaction of a ligand with a protein is characterized by the binding free energy. The free energy (F) is the thermodynamic quantity that is directly related to the experimentally measurable inhibition constant (IC50) and depends on electrostatic, quantum, and aqueous solvation forces, as well as on the statistical properties of the interacting molecules. Two major contributions make F non-additive: 1) the electrostatic and solvation energy, and 2) the entropy.
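The relation between the free energy and a measured constant can be illustrated with the standard expression F = RT ln K, with K in mol/L relative to a 1 M standard state. Treating an IC50 as if it were an equilibrium inhibition constant is an approximation that depends on assay conditions, so the sketch below is a rough conversion under that assumption; the function names are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)


def free_energy_from_constant(k_molar, temperature=298.15):
    """Binding free energy in kJ/mol from a dissociation-type constant in mol/L."""
    return GAS_CONSTANT * temperature * math.log(k_molar)


def constant_from_free_energy(free_energy, temperature=298.15):
    """Inverse conversion: constant in mol/L from a free energy in kJ/mol."""
    return math.exp(free_energy / (GAS_CONSTANT * temperature))


micromolar_binder = free_energy_from_constant(1e-6)  # about -34 kJ/mol
```

On this scale each factor of 10 in the constant corresponds to about 5.7 kJ/mol at room temperature.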
Most popular scoring functions employ a reasonable approximation for short-range quantum interactions but do not perform a detailed calculation of aqueous electrostatics and entropy. Both the solvation energy and the entropy are difficult to compute, so instead of exact computations scoring functions use an approximation in which the contributions of the non-additive properties are estimated as fractions of easier-to-calculate pairwise interactions: electrostatics and van der Waals forces.
Such an approximation works to some extent because vacuum electrostatics is nearly canceled by solvation energies, while the enthalpy of binding is approximately compensated by entropy. Therefore, the calculation of solvation energies and entropy can seemingly be avoided by combining molecular mechanics, van der Waals, and electrostatic terms linearly, usually with small numerical coefficients (of the order of 10%).
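A linear score of this kind can be written down in a few lines. The term names, the weights of 0.1, and the energies below are all invented for illustration; real scoring functions use their own terms and fitted coefficients.

```python
def linear_score(terms, weights):
    """Weighted sum of pairwise energy terms (all values in kJ/mol)."""
    return sum(weights[name] * value for name, value in terms.items())


# Hypothetical vacuum energy terms for one ligand pose.
pose_terms = {"van_der_waals": -40.0, "electrostatic": -25.0, "internal": 5.0}

# Small coefficients of the order of 10%, standing in for the solvation
# and entropy contributions that are not computed explicitly.
pose_weights = {"van_der_waals": 0.1, "electrostatic": 0.1, "internal": 0.1}

score = linear_score(pose_terms, pose_weights)  # about -6 kJ/mol
```

The weights carry all the physics that is left out, so the same coefficients are applied to every pose regardless of how solvent-exposed or constrained it actually is.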
However, a potential energy surface given by such a linear combination of unrelated quantities with statistics-based coefficients is not necessarily related to the true interaction profile. That is why such a simple score frequently fails to reproduce unique binding modes and hence gives docking false negatives. At the same time, this approximation tends to overestimate the affinity of weak binders, producing docking false positives. Thus, in spite of the reasonable accuracy of such predictions, the selectivity of scoring functions is low: they frequently cannot identify a really strong binder among a pool of similar weak binders.
QUANTUM software does not rely on approximate cancellations of important physical quantities. Instead, we employ our continuum water model to compute both the vacuum and aqueous electrostatic energies, use quantum mechanics to calculate the short-range forces, and use thermodynamic sampling to obtain the free energy (including the entropy). As a result, we can not only observe the necessary cancellations between individual energy components, but also perform molecular modeling in more realistic and physically justified potentials. Fig. 1 shows the results of a docking run on a single rigid protein structure for 220 different ligands. The r.m.s. error in free energy is 2 kJ/mol, and the correlation coefficient is 0.7.
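For reference, the two quality measures quoted for Fig. 1 can be computed from paired lists of calculated and experimental free energies as sketched below. The function names are hypothetical, and the correlation coefficient is assumed here to be the Pearson coefficient, which the text does not state explicitly.

```python
import math


def rms_error(calculated, experimental):
    """Root-mean-square deviation between two equally long lists."""
    squares = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squares) / len(squares))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

The two numbers answer different questions: the r.m.s. error measures absolute agreement, while the correlation coefficient measures how well the ranking and trend are reproduced.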
The selectivity improvement in our approach is illustrated by the following model calculation. First, we derived a simple linear model directly from our QUANTUM vacuum force field. The short-range interactions were variationally adjusted (to allow for empirical hydrogen bonds). The van der Waals “scaling factors” and the “protein dielectric constant” were found by correlating the suggested score with experimental binding affinities for a number of known complexes. Both the linear score and the complete QUANTUM force field were tested using 300 protein-ligand pairs and showed comparable accuracy.
Fig. 2 shows docking funnels (the energy difference between conformers plotted as a function of the r.m.s. deviation from the known crystallographic position of the ligand). The free energies of numerous binding modes (conformers) were calculated with the full QUANTUM free energy (red lines) and estimated with the linear scoring function (blue lines). The conformers were generated by the QUANTUM 3.3 docking program. The resulting energies were averaged over conformers with similar r.m.s. values and plotted on the same graph.
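The averaging step used to build such a funnel can be sketched as a simple binning of conformers by their r.m.s. deviation. The bin width of 0.5 Å and the choice of the lowest-energy conformer as the zero of energy are assumptions made for this illustration; the text does not specify either.

```python
from collections import defaultdict


def docking_funnel(rmsd, energy, bin_width=0.5):
    """Average relative energy of conformers grouped by r.m.s. deviation.

    Returns a list of (bin center, mean energy above the minimum) pairs,
    sorted by increasing r.m.s. deviation.
    """
    e_min = min(energy)  # assumed reference: the lowest-energy conformer
    bins = defaultdict(list)
    for r, e in zip(rmsd, energy):
        bins[int(r // bin_width)].append(e - e_min)
    return [
        ((index + 0.5) * bin_width, sum(values) / len(values))
        for index, values in sorted(bins.items())
    ]


# Invented conformers: energies rise as the ligand moves away from the crystal pose.
funnel = docking_funnel(
    rmsd=[0.1, 0.3, 0.7, 1.2],
    energy=[-10.0, -9.0, -6.0, -2.0],
)
```

Plotting the returned pairs for two energy functions on the same axes gives the comparison described for Fig. 2: the steeper curve penalizes misplaced poses more strongly.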
The model calculation shows that the QUANTUM force field has a much steeper docking funnel, i.e. it is more sensitive to misplacement of a ligand than a scoring function. Therefore, the complete QUANTUM force field can be used to distinguish similar binding modes and hence obtain much more accurate docking positions. As a result, QUANTUM shows dramatically lower rates of false positives and false negatives than a scoring-function-based method.
In fact, non-additive interactions (especially solvation effects) play a key role in the molecular recognition of small molecules by proteins. QUANTUM software is practically the only accurate and highly sensitive method available to a broad audience of researchers that is capable of realistically modeling intermolecular interactions.