Showing posts with label docking. Show all posts
Showing posts with label docking. Show all posts

Wednesday, October 8, 2008

Making a good water model: Molecules do conformationally change when cross from a gas to water solution

Solvation energy calculation is absolutely crucial for a successful binding free energy (IC50) determination. Quantum Pharmaceuticals develops aqueous solvation models and tests them against available experimental data to validate the theoretical approaches.

The graph on the left represents two types of solvation energy calculations compared with experiments. The first series (small circles) are the energy differences on solvation for a set of molecules without conformational changes taken into account. The second set (large squares) is obtained after a single optimization run.

The correlation with the experiment clearly improves after conformational changes calculations. Apparently this does not only mean that the model is good, it also means that the molecules do change structure when inserted into water from the gas phase.

Friday, June 6, 2008

Docking validation study: PDK1-kinase (oncology)

Following the classic thrombine study, we catch up with a more important target: PDK1 kinase.

Pyruvate dehydrogenase kinase, isozyme 1, also known as PDK1, is a human gene.It codes for an isozyme of pyruvate dehydrogenase kinase (PDK).Pyruvate dehydrogenase (PDH) is a part of a mitochondrial multienzyme complex that catalyzes the oxidative decarboxylation of pyruvate and is one of the major enzymes responsible for the regulation of homeostasis of carbohydrate fuels in mammals. The enzymatic activity is regulated by a phosphorylation/dephosphorylation cycle. Phosphorylation of PDH by a specific pyruvate dehydrogenase kinase (PDK) results in inactivation.

There are no as much known inhibitors as for thrombine. BindingDB gives a few more than 70 compounds with measured binding affinities, all relatively strong binders, many of them similar to each other. We run our QUANTUM software to perform docking and the affinity calculations. The results are represented on the graph and demonstrate a solid correlation. In fact the correlation shows QUANTUM's ability to identify strong binders and distinguish between similar compounds (selectivity).

Thursday, June 5, 2008

Docking validation study: classic example, thrombine

The Figure on the left represents a docking study of more than 200 molecules with known activity on thrombin. The protein is a well known target ....

We have extracted the binding data from the BindingDB database and docked all the molecules onto a single (of a few available) 3D structure (2cn0 from the pdb databank).

The figure represents graphically the results of the research. The calculated and the measured activities are well correlated. Strong binders are indeed identified as strong binders (left bottom part of the graph). The accuracy of the predictions is quite good (see our discussion on the quality of the biological data here and here).

The results of the calculations can be conveniently summarized in terms of confidentiality matrix. Normally a first screen of novel compounds is performed at a certain concentration to distinguish between the active and non-active compounds. Let's take a standard, 1muM (~-35kJ/M) activity, as a separation cut-off. Then the confidence matrix has the following elements:
  • Experimentally active, Predicted active: 29 molecules
  • Experimentally n-active, Predicted active: 15 molecules (false positives)
  • Experimentally active, Predicted n-active: 8 molecules (false negatives)
  • Experimentally n-active, Predicted n-active: 156 molecules

Monday, May 26, 2008

Protein Flexibility and False Positives detection.

Standard hit identification procedure with QUANTUM software implies screening of a large compound library against a given protein target. An example of such procedure for a small set of compounds with known activities is discussed in another blog entry.

Let us show first that a calculation with flexible protein gives a reasonable prediction of the binding free energy. To do that we selected a set of ~200 protein - ligand complexes from the BindingDB database. The protein-ligand pairs were selected mainly so that the complex is small and therefore the whole calculation is fast. The results are represented on the Figure. The horizontal and the vertical axis represent the calculated and the experimental value of the binding free energy calculated from the complexed positions of the ligand within the protein.

The correlation is clearly there and in a few days I will show that the calculated values demonstrate not only the accuracy, but also a good selectivity.

The other Figure represents the correlation between the results of rigid receptor fast docking procedure (horizontal axis) and the fully flexible binding free energies (vertical axis). Although the rigid protein force field has a decent correlation, it fails to recognize electrostatic clashes and thus leads to a fairly large amount of false positives among the predicted ligands. Only about 10% of all the ligands, all originally predicted in the muM range survives as binders. The trend is also clear: all the binding energy values increase (fully flexible force field gives less binders than the rigid calculation would suggest).

Wednesday, May 21, 2008

Computer aided drug design video from Quantum Pharma


Molecular modelling software of Quantum Pharmaceuticals is used to dock small molecule to active site of target protein. The molecular docking on flexible protein is explored. The Quantum docking software is available for free use at LeadFinding.com, the online hit-to-lead optimization service to filter and profile chemical compounds in chemical database of ChemDiv - organic chemistry supplier.

Monday, December 10, 2007

EMD Serono, Inc licensed Quantum Pharmaceuticals’ drug discovery technology.


Moscow, 10 December, 2007

EMD Serono, Inc entered license agreement with Quantum Pharmaceuticals to get an access to Quantum Pharmaceuticals’ small molecule hit identification computational platform and apply it in in-house research.

The Quantum Pharmaceuticals’ industry leading computational drug design technologies is based on applying quantum, molecular and statistical physics in molecular modeling and was successfully applied in different drug discovery projects. The initial term of the agreement is one year. The financial terms of the agreement were not disclosed.

Quantum Pharmaceuticals is a drug discovery company based in Moscow, Russia specializing in small molecule screening and design through the use of its proprietary technology platform.

EMD Serono, Inc. and Merck Serono S.A. are affiliates of Merck KGaA, Darmstadt, Germany, with over 16,000 employees worldwide and a strong presence on all continents.

Friday, December 7, 2007

How good are biological experiments? HERG binding data analysis


A correlation between predicted and expermentally measured values of biological activity is a natural measure of a model quality. For instance, QUANTUM docking software calculates binding free energies, which are directly comparable with experimental values of -p(binding constant, Kd). Root mean squared error between the measured and the calculated quantities is the quantitative measure of the software performance.
Whatever the correlation is presented to prove the validity of a model, another important issue is the quality of the experimental data itself. The reported values for binding constants (or activities) often vary because of different measurement strategies, experimental errors or interpretation uncertanties. To visualize the situation we investigated a few datasets for HERG binding taken from QSAR World website.
The downloaded files were saved in source folder and processed with the following simple python script (thanks to openbabel):
files = os.listdir('source/')
molecules = []
for file in files:
molfile = readfile("sdf",'source/'+file)
for mol in molfile:
molfp = mol.calcfp()
present = 0
for savedmol in molecules:
savedmolfp = savedmol.calcfp()
if (molfp | savedmolfp == 1):
present = 1
print mol.data, savedmol.data
if (not present):
molecules.append(mol)

The results where analyzed in a spreadsheet program and represented on the graph above. A lot of molecules occur multiple times in the datasets. While in many of the cases the activities coinside up to 0.01 (which most probably indicates citing from a single source), the remaining values thouch correlated with each other, differ by roughly a single pKd unit.



Monday, September 3, 2007

QUANTUM and albumin binding calculations: the role of protein flexibility


Drug distribution within the body is determined mainly by free (unbound) concentration of drug in circulating plasma. The unbound fraction, in turn, depends on drug absorption by plasma proteins. Human Serum Albumin (HSA) is the most abundant blood plasma protein and is produced in the liver.

Binding of a compound to HSA results in an increased solubility in plasma, decreased toxicity, and /or protection against oxydation of the bound ligand. Binding can also have a significant impact on the pharmacokinetics of drugs, e.g. prolonging in vivo half life of the therapeutic agent. However too strong binding prevents drug release in tissues. That is why HSA binding information is one of the key characteristics of a compound determining its ADME properties. Successful in silico calculation of HSA binding could provide a structural basis of drug derivatives with altered HSA-binding properties.

HSA has at least two main drug binding sites characterized as Sudlow site I and Sudlow site II, which bind a number of drugs selectively. Site I, also known as the warfarin binding site, is formed by a pocket in subdomain IIA of HSA. Site II is located in subdomain IIIA and is known as the benzodiazepine binding site. Ibuprofen and diazepam are selectively bind to site II. Multiple active sites make HSA a complicated target for structure-based modeling.

The ligands with known HSA binding affinities were taken from and prepared with a set of built in QUANTUM molecular preparation and processing tools. Acenocoumarol, Acetylsalicilic_acid, Azapropazone, Benzylpenicillin, Benzylthiouracil, Bilirubin, Canrenoate, Carbamazepine, Carbenicillin, Chlorpropamide, Diphenylhydantoin, Furosemide, Indomethacin, Methyl_p-hydroxybenzoate, N-acetyl-L-tryptophan, Oxyphenbutazone, Phenobarbital, phenyl_salicylate, Phenylbutazone, Piretanide, Propyl_p-hydroxybenzoate, Quercetin, Salicylate, Sodium_benzoate, Spironolactone, Sulfadimethoxine, Sulfamethizole, Sulfathiazole, sulfisoxazole, Tenoxicam, Tolbutamide, Warfarin, Carprofen, Chlofibrate, Iopanoate, L-tryptophan were docked on to both of the sites. Three different structures – 2BXH, 2BXF and 2BXG were taken for docking. 2BXH was used for docking to the first binding site, 2BXF and 2BXG have different structures of the binding site II and we decided to use both structures for docking. Docking grid 20x20x20A were centered around a central ligand atom in appropriate binding site.

The results of the docking runs are summarized on the Figure. Both binding sites show considerable flexibility, molecular dynamics and the binding free energy calculations with flexible protein (large red points) lead to remarkable improvement of the predicted values of HSA binding affinities with respect to experiment (smaller blue points represent the results of docking on a rigid protein model). Since QUANTUM model does not have training parameters the presented correlation proves QUANTUM abilities to predict HSA binding constants of druglike compounds to both of the active sites.

Tuesday, May 15, 2007

QUANTUM free energy vs. statistical scoring functions

QUANTUM employs quantum mechanics, thermodynamics, and an advanced continuous water model for solvation effects to calculate ligands binding affinities. This approach differs dramatically from scoring functions that are commonly used for binding affinity predictions. By including the entropy and aqueous electrostatics contributions in to the calculations directly, QUANTUM algorithms produce much more accurate and robust values of binding free energies.
Interaction of a ligand with a protein is characterized by the value of binding free energy. The free energy (F) is the thermodynamic quantity, that is directly related to experimentally measurable value of inhibition constant (IC50) and depends on electrostatic, quantum, aqueous solvation forces, as well as on statistical properties of interacting molecules. There are two major contributing quantities leading to non-additivity in F: 1) the electrostatic and solvation energy, and 2) the entropy.
Most of popular scores employ a reasonable approximation for short-range quantum interactions, but do not perform a detailed calculation of aqueous electrostatics and entropy. Both the solvation energy and the entropy are difficult to compute: so instead of exact computations, scoring function use an approximation. In this approximation, the contributions of non-additve properties are estimated as fractions of easier-to-calculate pairwise interactions: electrostatics and van der Waals forces.
Such approximation works to a some extent because vacuum electrostatics is nearly canceled by solvation energies; at the same time the enthalpy of binding is approximately compensated by entropy. Therefore, the calculations of solvation energies and entropy seemingly can be avoided by combining molecular mechanics, van der Waals and electrostatic forces linearly with usually small numerical coefficients (of the order of 10%).
However, a potential energy surface given by such linear combinations of unrelated quantities with statistics-based coefficients is not necessarily related to the true interaction profile. That is why such a simple score fails frequently to reproduce unique binding modes and hence gives docking false negatives. In the same time, this approximation tends to overestimate the affinity of weak binders producing docking false positives. Thus, in spite of reasonable accuracy of such predictions, the selectivity of scoring function is low. This means that frequently scoring functions will not allow to identify really strong binder among the pool of similar weak binders. Moreover, affinities of weak binders may be overestimated.
QUANTUM software does not rely on approximate cancellations of important physical quantities. Instead, we employ our continuous water model to compute both the vacuum and aqueous electrostatic energies, use quantum mechanics to calculate the short range forces and thermodynamic sampling to obtain the value of the free energy (entropy). As the result we can not only observe the necessary subtractions of individual energy components, but also perform molecular modeling in a more realistic and physically justified potentials. Fig. 1 shows the results of a docking run on a single rigid protein structure for 220 different ligands. R.m.s. error in free energy is 2kJ/mol, the correlation coefficient is 0.7.
The selectivity improvement in our approach is illustrated by the following model calculation. First, we derived a simple linear model directly from our QUANTUM vacuum force field. The short range interactions where variationally adjusted (to allow for empirical hydrogen bonds). The van der Waals “scaling factors” and “protein dielectric constant” were found by correlating the suggested score with experimental binding affinities for a number of known complexes. Both the linear score and complete QUANTUM force field were tested using 300 protein-ligand pairs and showed comparable accuracy.
Fig. 2 shows docking funnels (the energy difference between the conformers plotted as a function of r.m.s. from the known crystallographic position of a ligand). Both full QUANTUM free energies (red lines) and scoring function (blue lines) were used to calculate, in the case of QUANTUM, or estimate, in the case of linear score, the free energies for numerous binding modes (conformers). The conformers where generated by the QUANTUM 3.3 docking program. The resulting energies were averaged over the conformers with similar r.m.s. values and plotted on the same graph.
The model calculation shows that the QUANTUM force field has a much steeper docking funnel, i.e. is more sensitive to misplacements of a ligand, than a scoring function. Therefore QUANTUM complete force field can be used to distinguish similar binding modes and hence obtain much more accurate docking positions. As a result, QUANTUM shows dramatically lower ratio of false positives and false negatives as compared with a scoring function based method.
In fact, non-additive interactions (especially solvation effects) play a key role in molecular recognition of small molecules by proteins. QUANTUM software is practically the only accurate and highly sensitive method available to a broad audience of researchers, which is capable to realistically model the intermolecular interactions.

Thursday, April 19, 2007

Docking selectivity: Additive vs. non-additive force field

The free binding energy (F) of a small molecule and a protein is a non-additive complex function of individual interatomic interactions. There are two major contributing quantities leading to non-additivity in F: the electrostatic energy, and the entropy.

A common approach to molecular docking is to develop a simple (generally additive) model of intermolecular interactions and train it to reproduce experimental values of binding energy for a range of known inhibitors. The obvious advantage is the calculations speed.

Any more sophisticated approach takes requires more computational resources and may hardly lead to an immense improvement in calculations accuracy. The question thus is: do non-additive force fields have any advantages in docking situtations?

To investigate the issue the following experiment was performed:
  1. A simple force Molecular Mechanics (MM) force field, containing a reasonable approximation for (distant-dependent media polarization) electrostatics, van-der-waals and hydrogen bonding, was developed.
  2. Same van-der-waals and hydrogen bonds were paired with vacuum electrostatics and a (non-additive) water model to simulate solvation effects.
Both models were transformed into simple linear regressions (to avoid sophisticated thermodynamic integration) and trained to reproduce the same amount of experimental values.

Although the two models show roughly the same level of accuracy in predicting the binding energies, the selectivity is drustically different. The figure above shows the results of the energy calculations for a set of few hundreds decoys. Both graphs represent the relative binding energy (counted from the minimum position) vs. the r.m.s. deviation of the calculated ligand positions from the known native position. The lines are the energy offset values averaged over the conformers (decoys) with similar r.m.s. positions.

The conclusion?
The solvation energy is the major source of non-additivity in ligand binding. Though being one of the most complicated quantities to be acounted properly in a docking run, the solvation energy is one of the major mechanisms of molecular recognition

Friday, January 19, 2007

What makes a correct binding energy calculation?

A correct calculation of a ligand binding affinity requires proper account of a few large contributions at the same time. The final value of the binding free energy is the sum of multiple sign-alternating entries: electrostatic (ES), solvation (ESolv), exchange-van der Waals (vdW) and entropic (TdS) term.

Common wisdom claims that ES and ESolv nearly compensate each other (up to a few h-bond energies), the remaining (normally large) vdW contribution is nearly canceled by TdS term. In fact, there is another important anti-correlation: strongly burried ligands with large vdW contributions are normally both entropycally and electrostatically constrained: The deeper a ligand gets into a protein (low dielectric constant medium) from water (high dielectric costant medium), the more the dessolvation energy (ES+ESolv) is, the larger the entropy losses normally are.


To clarify this issue, a simple calculation with 32 pdb structures with knows binding affinities was performed (1AU0, 1AU3, 1AU4, 1BR5, 1BR6, 1C83, 1C84, 1C85, 1C86, 1C87, 1EFY, 1EJN, 1JIJ, 1JIK, 1MS6, 1N3W, 1NAX, 1NLI, 1PWY, 1Q6K, 1RRI, 1RRW, 1RRY, 1RS2, 1TOW, 1TSM, 1URG, 1V2M, 2CBR, 2CBS, 3CBS, 4ERK). The binding affinities were taken from the PDBBind database. For every structure both the complex, the ligand and the protein was initially optimized in QUANTUM force field, then a thermodynamic integration was used to establish the free binding energies. The correlation between the calculated and measured (reported in the literature) values is shown on the left Figure 1. The free binding energies are reported in kJ/mol.


First we check the sanity of our solvation energy calculations. The following graph represents the anti-correlation between the electrostatic energy ES and the solvation contribution ESolv (both polar and non-polar, no additional dielectric constant for the protein was used since the protein is fully flexible). The two values nearly cancel each other, as it should be the case.
The only physically meaningful combination is the dessolvation punishment ESolv+ES, which is plotted against the vdW contribution (see Figure 3).


Remarkably, the dessolvation energy anti-correlates with the vdW term, which is a good measure of the contact surface. In fact, many scoring functions are pair sums of relatively short range potentials and hence correlate well with the contact surface. Non-additive desolvation energy does not correlate with the contact terms and (together with entropy losses) provides a limitation on binding energy for large and deeply burried ligands. TdS depends on the details of interactions in a smooth logarithmic way, which means that the dessolvation punishment is a leading effect.

The whole Molecular Mechanics (MM) Hamiltonian was used to generate MM trajectories for thermodynamic integration for each of the complexes. The results of the calculations are represented on the Figure 1. Root mean squared error of the calculations 6.1 kJ/mol. The dessolvation energy is a big contribution and removes (or better to say improves over) the correlation between the binding free energy and contact surface.
Conclusion 1: Binding free energy of a ligand is not a contact surface (or anything close to that approximable with a sum of pairwise short-range potential)
Conclusion 2: Desolvation energy is a major limiting factor restricting the binding energy of large ligands.