25223 Kinetic profiling of protein-drug binding using non-equilibrium molecular dynamics simulations
Begeleider(s): Wouter Vervust

Richtingen: Master of Science in Engineering Physics


Targeted drug design: thermodynamic and kinetic selectivity

The human proteome consists of a tightly regulated network of proteins, where specific sites on these proteins allow them to interact with other proteins, small (drug) molecules, antigens, or ligands in general. The result of such binding interaction is a conformational change of the target protein, where it takes on a – sometimes slightly – different 3D structural position, which allows the target protein to project new binding sites, or to transfer functional groups from one protein to another. In other words, the binding interaction activates or inhibits a specific function of the target protein, which enables the interaction with yet another protein or ligand, and thus forms one branch of a signaling cascade in the protein network. 

As the name reveals, targeted drug design deals with the development of ligands that bind to a target protein, to inhibit or activate functionality depending on the toxic nature of the protein. Generally, the structure of the target protein and its binding sites are known a priori, and one searches for compatible ligands. In the last decades, the quest thereof has been dominated by structural and thermodynamical considerations, where candidate ligands are ranked by their equilibrium constants Kd. This constant, also known as the dissociation constant, describes the fraction of target-proteins that would be bound to the ligand at equilibrium conditions, and in particular, at constant ligand concentration. A schematic picture of protein-ligand binding is given in Figure 1, where it is seen that the equilibrium constant is given by the ratio of the kinetic rate constants koff and kon. From physical chemistry, it is known that Kd is proportional to the Boltzmann constant associated with the free energy difference of the bound and unbound states (Figure 1, left).

Figure 1: Left: schematic picture of the Protein-Ligand (P-L) binding free energy profile. The equilibrium constant Kd is fully described by the free energy difference deltaG of the unbound and bound states, describing the strength of the P-L bound complex. For this reason, Kd has served as the dominant parameter in targeted drug design. However, the binding reaction is almost always accompanied by an energy barrier which is often not negligible, and can highly impact the effectiveness of the candidate ligand in vivo (see text). Right: differential equation accompanying the bimolecular binding reaction, where [P] and [L] denote protein and ligand concentrations, respectively.

While it is true that Kd is a good indicator of binding strength, it has major shortcomings that prove to be detrimental: more than 90% of promising drug molecules fail human trials by off-target toxicity and/or bad overall target selectivity. Indeed, thermodynamic selection based on Kd neglects the existence of a free energy barrier that needs to be overcome during the binding reaction (Figure 1, left). While the final bound state at the target protein may be energetically most favourable, the energy barrier may be sufficiently high that most drug molecules bind to a – potentially toxic – off-target protein with less favourable bound state, but lower energy barrier. To this end, the notion of kinetic selectivity has gained a lot of attention in the latest years, where one tries to obtain the rate constants kon and koff. These parameters hold information about kinetic properties of the protein-ligand interaction, such as the drug residence-time, allowing better profiling of candidate drug molecules and saving costly and time-consuming in vivo experiments.

Molecular dynamics for protein-ligand binding

While experimental techniques provide excellent methods to study proteins, they are time-consuming and expensive. Moreover, probing the proteins’ extremely fast configurational changes and interactions at the atomic scale remains challenging. To this end, computational techniques have been developed for high-throughput applications (e.g. protein-ligand docking in targeted drug design), and molecular dynamics (MD) simulations that provide the fully atomistic picture of proteins. However, binding/residence times are typically multiple milliseconds, seconds, minutes or even hours, while conventional MD simulations are rarely longer than 1 millisecond. The binding interaction is a so-called rare event in simulations, and makes the study of time-dependent parameters (i.e. kinetics) a difficult subject. This thesis will use and develop advanced methodological simulations tools that accelerate the binding process, such that enough binding events are simulated and the significant statistics are achieved at computationally reasonable cost. Principles from statistical physics will be used to retrieve the true dynamics of the binding process as if no acceleration was applied.



The main objective of this thesis is the derivation of the kinetic rate constants of protein-ligand binding. Recently, two methods were developed to extract both the thermodynamics and kinetics of rare events: the Transition-based Reweighting Analysis Method (TRAM) [1]⁠, and the Dynamic Histogram Analysis Method Extended to Detailed balance (DHAMed) [2]⁠. Both methods are based on a Markov State Model, where the simulation trajectory data is discretized in bins along a predefined reaction coordinate (DHAMed), or the simulation trajectory data is discretized in predefined phase space states (TRAM). Both methods deliver an improvement to earlier methods regarding Markovianity, i.e. the assumption that each transition between bins/states depends only on the bin/state it is currently inhabiting, and not on its history. While the Markovian assumption is clear, the implementation thereof is not, and the reason why is explained next.

Since full exploration of the protein’s phase space is impossible (3N-6, N > 1000), the simulations are biased to explore the regions of interest, by introducing a bias term F_bias to the Hamiltionian of the system. For instance, in Umbrella Sampling (US), harmonic potentials (umbrellas) are added along a predefinied reaction coordinate, such that consecutive simulations are forced to sample different regions along this binding reaction coordinate. The ensemble of simultions thus captures the entire binding trajectory, after which these biased (unphyiscal) simulations need to be 'unbiased' before statistical analysis can be applied, and the true physical dynamics retrieved.

The reaction coordinate can be more complicated than just a distance parameter, but it will essentially be an extremely low dimensional subset of the available degrees of freedom (DoFs). As the next biased simulation is initialized using the previous biased simulation, the  resulting trajectory may be forced along a specific binding trajectory that is favorable using the predefined reaction coordinate. Indeed, the simulations will not capture those binding pathways that present energy barriers in the DoFs that are not captured by the reaction coordinate. This introduces a problem with the aforementioned Markov assumption: transitions become dependent on their binding pathway, and no longer solely depend on their position along the reaction coordinate. Both TRAM and DHAMed try to overcome this sampling problem by adapting the Markov Model to incorporate ‘local’ equilibrium sampling instead of ‘global’ equilibrium sampling, as was used in older methods like WHAM and MBAR.

The DoFs that are orthogonal to the reaction coordinate are also called orthogonal degrees of freedom, and they pose a challenging problem in the computational calculation of binding kinetics, as well as protein folding rates, diffusion through membranes or porous materials, etc.

This thesis will use the Abelson tyrosine kinase (ABL) as a test case to study the role of orthogonal DoFs (Figure 2). This protein is involved in cell differentiation, cell division, cell adhesion and stress response [3]⁠. Some patients develop the BCR-ABL fusion oncogene encoding the fusion oncoprotein BCR-ABL [4]⁠. While ABL is normally inactive, the oncoprotein BCR-ABL is always active, leading to increased cell growth and -division, resulting in chronic myeloid leukaemia. Small drug molecules (e.g. imatinib) have been developed that artificially put BCR-ABL in the inactive state, by occupying the ATP-binding site of ABL. 

For the ABL-imatinib binding complex, these aforementioned orthogonal DoFs are present in the form of (1) water molecules, the specific location of which can change the binding trajectory by its hydrogen bonds, and (2) the multitude of metastable ABL conformations, some of which do not allow imatinib binding, and some of which that allow different binding orientations. Some ABL conformations are given in Figure 2, as well as imatinib bound to ABL.

Figure 2: left: imatinib (magenta) bound to ABL. Right: different conformations of ABL, where important flexible loops are colored red/orange (P-loop, active/inactive) and blue/seablue (A-loop, inactive/active)

The student will get familiar with the theoretical concepts of biased simulations and Markov state models and will derive a protocol to extract the binding dynamics from these biased simulations using DHAMed/TRAM. Depending on the creative interest of the student, a new methodology tailored for the treatment of the orthogonal DoFs in binding kinetics can be developed. The student will run and analyze advanced molecular dynamics simulations on the ABL-ligand complex to test whether the binding kinetics can be retrieved with the new protocol.


[1] H. Wu, F. Paul, C. Wehmeyer, and F. Noé, “Erratum: Multiensemble Markov models of molecular thermodynamics and kinetics (Proc Natl Acad Sci USA (2016) 113 (E3221–E3230) DOI: 10.1073/pnas.1525092113),” Proc. Natl. Acad. Sci. U. S. A., vol. 114, no. 31, p. E6475, 2017, doi: 10.1073/pnas.1711690114.

[2] L. S. Stelzl, A. Kells, E. Rosta, and G. Hummer, “Dynamic Histogram Analysis To Determine Free Energies and Rates from Biased Simulations,” J. Chem. Theory Comput., vol. 13, no. 12, pp. 6328–6342, 2017, doi: 10.1021/acs.jctc.7b00373.

[3] J. Y. J. Wang, “The Capable ABL: What Is Its Biological Function?,” Mol. Cell. Biol., vol. 34, no. 7, pp. 1188–1197, 2014, doi: 10.1128/mcb.01454-13.

[4] A. Hai, N. A. Kizilbash, S. H. H Zaidi, J. Alruwaili, and K. Shahzad, “Differences in structural elements of Bcr-Abl oncoprotein isoforms in Chronic Myelogenous Leukemia,” Bioinformation, vol. 10, no. 3, pp. 108–114, 2014, doi: 10.6026/97320630010108.