Protein network and mutations
The human proteome is a tightly regulated network of proteins. Specific sites on each protein allow it to interact with other proteins, small (drug) molecules, antigens, or ligands in general. Such a binding interaction typically induces a conformational change in the target protein: it adopts a different, sometimes only slightly different, 3D structure, which allows it to expose new binding sites or to transfer functional groups to another protein. In other words, the binding interaction activates or inhibits a specific function of the target protein, which in turn enables the interaction with yet another protein, and so on.
The structure of a protein is encoded by its DNA sequence, and a mutation of a single DNA base pair can result in a malfunctioning and/or misfolded protein. Mutations are not necessarily harmful: some DNA mutations yield exactly the same protein product (synonymous or silent mutations), and others substitute an amino acid (AA) with a similarly sized AA that has similar physicochemical properties. However, when a DNA mutation produces an AA substitution in or near an important binding site, the protein may no longer recognize its native ligands. This disturbs the protein signalling network and can lead to toxic or oncogenic cell regulation.
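The distinction between harmless and potentially harmful point mutations can be illustrated with a minimal Python sketch. It uses only a small excerpt of the standard genetic code, and the function name `classify_point_mutation` is a hypothetical helper introduced here for illustration; it is not part of any existing tool.

```python
# Excerpt of the standard genetic code: only the codons needed below.
CODON_TABLE = {
    "GAA": "E", "GAG": "E",  # glutamate
    "GAC": "D", "GAT": "D",  # aspartate
    "AAA": "K", "AAG": "K",  # lysine
}


def classify_point_mutation(codon, position, new_base):
    """Mutate one base of a codon and report the effect on the amino acid.

    Returns (mutated_codon, old_aa, new_aa, kind), where kind is
    'synonymous' if the amino acid is unchanged and 'missense' otherwise.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    old_aa = CODON_TABLE[codon]
    new_aa = CODON_TABLE[mutated]
    kind = "synonymous" if old_aa == new_aa else "missense"
    return mutated, old_aa, new_aa, kind


# Third-base change: the same amino acid is produced (silent mutation).
print(classify_point_mutation("GAA", 2, "G"))
# Third-base change to a similar, negatively charged amino acid (E -> D).
print(classify_point_mutation("GAA", 2, "C"))
# First-base change that reverses the side-chain charge (E -> K).
print(classify_point_mutation("GAA", 0, "A"))
```

Whether a missense substitution is actually tolerated depends on where it occurs in the folded protein, which a sequence-level check like this cannot capture.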
Abelson tyrosine kinase, cancer and small drug molecules
The protein Abelson tyrosine kinase (ABL) is involved in cell differentiation, cell division, cell adhesion and stress response [1]. ABL transfers a phosphate group from ATP to different substrate proteins, and forms an important link in a larger network of cell cycle regulators. Some patients develop the BCR-ABL fusion oncogene through a reciprocal translocation between chromosomes 9 and 22, which fuses part of the ABL gene to the BCR gene and yields a DNA sequence that codes for the fusion oncoprotein BCR-ABL [2]. The oncoprotein retains the kinase function of ABL; in fact, it performs this function too well. While ABL is normally inactive (not transferring phosphate), BCR-ABL is constitutively active (always transferring phosphate). This leads to increased cell growth and cell division, and results in chronic myeloid leukaemia (CML), a type of blood cancer.
Small drug molecules have been developed that artificially lock BCR-ABL in the inactive state by occupying the ATP-binding site of ABL. As a result, ATP can no longer bind, and there is no phosphate group to be transferred. The first such drug molecule was imatinib, which proved very successful in combating CML and is widely regarded as one of the first successes of targeted drug design. Unfortunately, some patients carry mutations near the ATP-binding site of ABL that prevent imatinib from binding. To overcome this resistance, second-generation drug molecules were developed. These were in turn blocked by further mutations, spurring the development of third-generation drug molecules.
Figure 1: Close-up of the ATP-binding site of the ABL kinase domain, with bound ATP (left, yellow) and bound imatinib (right, magenta). Imatinib inactivates ABL by competitively occupying the ATP-binding site. Important flexible loops are colored red (glycine-rich loop) and blue (activation loop).
Molecular dynamics for targeted drug design
While experimental techniques provide excellent methods to study proteins, they are time-consuming and expensive. Moreover, probing the extremely fast conformational changes and interactions of proteins at the atomic scale remains experimentally challenging. Computational techniques address these limitations: high-throughput methods (e.g. protein-ligand docking in targeted drug design) screen many candidates quickly, while molecular dynamics (MD) simulations provide a fully atomistic picture of protein motion.
The main goal of the master's thesis is to chart the conformational landscape of ABL when clinically relevant AA mutations are introduced. Knowledge of the important conformations of the mutated ABL proteins allows the design of compatible drug molecules. Because drug molecules bind to specific conformations of ABL, binding activity will depend mainly on whether the mutated ABL protein can still adopt these conformations. Recently, sixteen metastable conformations of native (unmutated) ABL were obtained using a Markov State Model (MSM) [3]. The structural information of these conformers is publicly available in PDB format, and will serve as the set of initial structures for the thesis.
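To make the idea behind an MSM concrete, the following minimal Python sketch estimates a transition matrix from a discretized trajectory and derives the long-time population of each state. It is an illustration under simplifying assumptions, not the procedure used in [3]: the toy trajectory, the two-state labelling and the function names are hypothetical, transitions are simply counted and row-normalized (no reversibility constraint), and the stationary distribution is obtained by plain power iteration.

```python
def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Count transitions i -> j at the given lag time and row-normalize."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    matrix = []
    for state, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            raise ValueError(f"state {state} has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, n_iter=1000):
    """Propagate a uniform distribution until it stops changing."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical discretized trajectory: 0 and 1 label two conformational states.
dtraj = [0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0]
T = estimate_transition_matrix(dtraj, n_states=2)
pi = stationary_distribution(T)
print("transition matrix:", T)
print("stationary populations:", pi)
```

In a real application the states would come from clustering many MD trajectories, and the lag time would have to be chosen such that the dynamics between states is approximately Markovian.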
In a first step, the lowest-energy structure of each conformer will be obtained using relatively short MD simulations. Next, clinically relevant mutations will be introduced in each conformer using the MODELLER software, after which the mutated conformers will be simulated with multiple longer MD simulations. The resulting trajectories will then be analyzed to determine which of the original conformers remain viable for the mutated variants. As a first approach, the root-mean-square deviation (RMSD) of each trajectory frame to each of the sixteen original conformers can be calculated, but it is likely that a novel descriptor will be needed to classify the trajectories effectively. For the latter, the student is encouraged to draw inspiration from the machine learning literature to construct an elegant classifier.
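The RMSD-based classification can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: all structures are assumed to contain the same atoms in the same order and to be already superimposed on a common frame (for example on a rigid part of the kinase domain), so no fitting is performed here. The toy coordinates, the labels "A" and "B", the cutoff value and the function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    No superposition is done: both structures must share a common frame.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for xa, xb in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def classify_frame(frame, references, cutoff):
    """Return (label, rmsd) of the closest reference conformer.

    The label is None if even the closest reference is beyond the cutoff,
    i.e. the frame resembles none of the known conformers.
    """
    label, best = min(
        ((name, rmsd(frame, ref)) for name, ref in references.items()),
        key=lambda pair: pair[1],
    )
    return (label if best <= cutoff else None), best


def classify_trajectory(frames, references, cutoff):
    """Assign every frame of a trajectory to a reference conformer (or None)."""
    return [classify_frame(frame, references, cutoff)[0] for frame in frames]


# Toy two-atom "conformers"; a real case would hold sixteen references.
references = {
    "A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "B": [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
}
frames = [
    [(0.0, 0.0, 0.0), (0.9, 0.1, 0.0)],  # close to conformer A
    [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],  # far from both references
]
print(classify_trajectory(frames, references, cutoff=0.5))
```

A frame labelled None signals a conformation outside the original set, which is exactly the situation in which a single RMSD cutoff becomes ambiguous and a more expressive descriptor would be useful.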
Figure 2: Different conformations of native ABL. The activation loop can be in an active state (light blue) or an inactive state (dark blue). The glycine-rich loop can similarly be in an active state (red) or an inactive state (orange). The availability of conformations in a mutated ABL protein will determine which drug molecules can bind to it. For example, imatinib is known to bind conformations with an inactive activation loop.
While the above workflow makes it possible to determine which conformations are energetically stable for the mutated ABL variants, it does not reveal which specific conformations are dominant for those variants. Indeed, a simulation initialized in a given conformation will relax to the nearest local energy minimum. Since important conformational transitions happen on a time scale that is inaccessible to brute-force MD simulations, the global energy minimum will not be found. If time allows, the lifetimes of the conformers can be calculated with biased MD simulations, in which the previously defined classification descriptors can be used as biasing coordinates.
[1] J. Y. J. Wang, “The Capable ABL: What Is Its Biological Function?,” Mol. Cell. Biol., vol. 34, no. 7, pp. 1188–1197, 2014, doi: 10.1128/mcb.01454-13.
[2] A. Hai, N. A. Kizilbash, S. H. H. Zaidi, J. Alruwaili, and K. Shahzad, “Differences in structural elements of Bcr-Abl oncoprotein isoforms in Chronic Myelogenous Leukemia,” Bioinformation, vol. 10, no. 3, pp. 108–114, 2014, doi: 10.6026/97320630010108.
[3] Y. Meng et al., “Predicting the Conformational Variability of Abl Tyrosine Kinase using Molecular Dynamics Simulations and Markov State Models,” J. Chem. Theory Comput., vol. 14, no. 5, pp. 2721–2732, 2018, doi: 10.1021/acs.jctc.7b01170.