Fitting quantum machine learning potentials to experimental free energy data: Predicting tautomer ratios in solution

Marcus Wieder, Josh Fass, and John D. Chodera
Chemical Science, in press [bioRxiv] [code]

We demonstrate, for the first time, how alchemical free energy calculations can be performed on systems simulated entirely with quantum machine learning potentials and how these potentials can be retrained on experimental free energies to generalize to new molecules from limited training data. We apply this approach to a difficult problem in small molecule drug discovery: predicting accurate tautomer ratios in solution.
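
As a minimal illustration of the quantity being predicted (not the workflow used in the paper), a tautomerization free energy ΔG for A -> B corresponds to an equilibrium ratio [B]/[A] = exp(-ΔG/RT). The function name and the 298.15 K default temperature below are assumptions made for this sketch.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL_PER_MOL_K = 0.0019872041


def tautomer_ratio(delta_g_kcal_per_mol, temperature_kelvin=298.15):
    """Equilibrium population ratio [B]/[A] for the tautomerization A -> B.

    delta_g_kcal_per_mol is the free energy of B relative to A in kcal/mol.
    The default temperature is an assumption of this sketch.
    """
    rt = R_KCAL_PER_MOL_K * temperature_kelvin
    return math.exp(-delta_g_kcal_per_mol / rt)
```

Under these assumptions, a free energy difference of about 1.36 kcal/mol corresponds to roughly a tenfold population ratio at room temperature, which is why errors of around 1 kcal/mol matter for tautomer prediction.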

Best practices for alchemical free energy calculations

Mey ASJS, Allen B, Bruce Macdonald HE, Chodera JD, Kuhn M, Michel J, Mobley DL, Naden LN, Prasad S, Rizzi A, Scheen J, Shirts MR, Tresadern G, and Xu H.
Living Journal of Computational Molecular Sciences 2022 [DOI]
[arXiv] [GitHub]

This living review for the Living Journal of Computational Molecular Sciences (LiveCoMS) covers the essential considerations for running alchemical free energy calculations for rational molecular design for drug discovery.

Towards chemical accuracy for alchemical free energy calculations with hybrid physics-based machine learning / molecular mechanics potentials

Dominic A. Rufa, Hannah E. Bruce Macdonald, Josh Fass, Marcus Wieder, Patrick B. Grinaway, Adrian E. Roitberg, Olexandr Isayev, and John D. Chodera.
Preprint ahead of submission.
[bioRxiv] [GitHub]

In this first use of hybrid machine learning / molecular mechanics (ML/MM) potentials for alchemical free energy calculations, we demonstrate how the improved modeling of intramolecular ligand energetics offered by the quantum machine learning potential ANI-2x can significantly improve the accuracy in predicting kinase inhibitor binding free energy by reducing the error from 0.97 kcal/mol to 0.47 kcal/mol, which could drastically reduce the number of compounds that must be synthesized in lead optimization campaigns for minimal additional computational cost.

Crowdsourcing drug discovery for pandemics

John D. Chodera, Alpha A. Lee, Nir London, and Frank von Delft.
Nature Chemistry 12:581, 2020
[DOI] [PDF] [COVID Moonshot] [GoFundMe]

The COVID-19 pandemic has left the world scrambling to find effective therapies to stem the tidal wave of death and put an end to the worldwide disruption caused by SARS-CoV-2. In this Correspondence, we argue for the need for a new open, collaborative drug discovery model (exemplified by our COVID Moonshot collaboration) that breaks free of the limitations of industry-led competitive drug discovery efforts that necessarily restrict information flow and hinder rapid progress by prioritizing profits and patent protection over human lives.

Is structure based drug design ready for selectivity optimization?

Steven K. Albanese, John D. Chodera, Andrea Volkamer, Simon Keng, Robert Abel, and Lingle Wang
Journal of Chemical Information and Modeling [DOI] [bioRxiv] [GitHub]

We asked whether the similarity of binding sites in related kinases might result in a fortuitous cancellation of errors in using alchemical free energy calculations to predict kinase inhibitor selectivities. Surprisingly, we find that even distantly related kinases have sufficient correlation in their errors that predicting changes in selectivity can be much more accurate than predicting changes in potency due to this effect, and show how this could lead to large reductions in the number of molecules that must be synthesized to achieve a desired selectivity goal.
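
The cancellation-of-errors argument can be sketched numerically. If the prediction errors for the two kinases have standard deviations sigma_a and sigma_b and correlation rho, the error in their difference (the selectivity) has variance sigma_a^2 + sigma_b^2 - 2*rho*sigma_a*sigma_b. The function name below is hypothetical, and this is an illustration of the statistics rather than the analysis performed in the paper.

```python
import math


def selectivity_error_std(sigma_a, sigma_b, rho):
    """Standard deviation of the error in a selectivity prediction.

    The selectivity is the difference of two predicted binding free energies,
    whose errors have standard deviations sigma_a and sigma_b and Pearson
    correlation rho.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    variance = sigma_a**2 + sigma_b**2 - 2.0 * rho * sigma_a * sigma_b
    # Guard against tiny negative values from floating-point round-off.
    return math.sqrt(max(variance, 0.0))
```

With uncorrelated errors of 1 kcal/mol each (illustrative values only), the selectivity error is about 1.41 kcal/mol, larger than either potency error. A correlation of 0.5 brings it back to 1 kcal/mol, and any higher correlation makes selectivity easier to predict than potency.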

Machine learning force fields and coarse-grained variables in molecular dynamics: application to materials and biological systems

Gkeka P, Stoltz G, Farimani AB, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson A, Maillet JB, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T.
Journal of Chemical Theory and Computation 60:6211, 2020. [DOI] [arXiv]

We review the state of the art in applying machine learning to coarse grain force fields in space and time to study multiscale dynamics.

Standard state free energies, not pKas, are ideal for describing small molecule protonation and tautomeric states

M R Gunner, Taichi Murakami, Ariën S. Rustenburg, Mehtap Işık, and John D. Chodera.
Journal of Computer Aided Molecular Design 34:561, 2020. [DOI] [PDF] [GitHub]

Here, we demonstrate how the physical nature of protonation and tautomeric state effects means that the standard state free energies of each microscopic protonation/tautomeric state at a single pH are sufficient to describe the complete pH-dependent microscopic and macroscopic populations. We introduce a new kind of diagram that uses this concept to illustrate a variety of pH-dependent phenomena, and show how it can be used to identify common issues with protonation state prediction algorithms. As a result, we recommend that future blind prediction challenges use microstate free energies at a single reference pH as the minimal sufficient information for assessing prediction accuracy and utility.
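
A minimal sketch of why a single reference pH suffices: the free energy of a microstate carrying n titratable protons shifts by n * RT * ln(10) * (pH - pH_ref) as the pH changes, so Boltzmann-weighting the shifted free energies gives the populations at any pH. The function name, the kcal/mol units, and the 298.15 K temperature are assumptions of this sketch, not details taken from the paper.

```python
import math

# RT in kcal/mol at an assumed temperature of 298.15 K
RT_KCAL_PER_MOL = 0.0019872041 * 298.15


def microstate_populations(free_energies, proton_counts, ph, ph_ref):
    """Populations of microstates at a given pH.

    free_energies: microstate free energies (kcal/mol) at the reference pH.
    proton_counts: number of titratable protons bound in each microstate.
    """
    ln10 = math.log(10.0)
    # Reduced (unitless) free energy of each microstate at the requested pH.
    reduced = [
        g / RT_KCAL_PER_MOL + n * ln10 * (ph - ph_ref)
        for g, n in zip(free_energies, proton_counts)
    ]
    # Subtract the minimum before exponentiating for numerical stability.
    lowest = min(reduced)
    weights = [math.exp(-(u - lowest)) for u in reduced]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, a monoprotic acid whose protonated and deprotonated states have equal free energies at a reference pH of 7 (that is, a pKa of 7) gives `microstate_populations([0.0, 0.0], [1, 0], ph=8.0, ph_ref=7.0)` with about 9% protonated and 91% deprotonated, as expected one pH unit above the pKa.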

Assessing the accuracy of octanol-water partition coefficient predictions in the SAMPL6 Part II log P Challenge

Mehtap Işık, Teresa Danielle Bergazin, Thomas Fox, Andrea Rizzi, John D. Chodera, and David L. Mobley.
Journal of Computer Aided Molecular Design, 34:335, 2020. [DOI] [PDF] [bioRxiv] [GitHub]

We assess the performance of the 91 methods submitted to the SAMPL6 blind challenge for predicting octanol-water partition coefficients (log P). The mean RMSEs of the five most accurate MM-based, QM-based, empirical, and mixed-approach methods were 0.92±0.13, 0.48±0.06, 0.47±0.05, and 0.50±0.06 log P units, respectively.

The SAMPL6 SAMPLing challenge: Assessing the reliability and efficiency of binding free energy calculations

Andrea Rizzi, Travis Jensen, David R. Slochower, Matteo Aldeghi, Vytautas Gapsys, Dimitris Ntekoumes, Stefano Bosisio, Michail Papadourakis, Niel M. Henriksen, Bert L. de Groot, Zoe Cournia, Alex Dickson, Julien Michel, Michael K. Gilson, Michael R. Shirts, David L. Mobley, and John D. Chodera
Journal of Computer Aided Molecular Design 34:601, 2020. [DOI] [PDF] [bioRxiv] [GitHub]

To assess the relative efficiencies of alchemical binding free energy calculations, the SAMPL6 SAMPLing challenge asked participants to submit predictions as a function of computational effort for the same force field and charge model. Surprisingly, we found that most molecular simulation codes cannot agree on what the binding free energy is, even for the same force field.

Octanol-water partition coefficient measurements for the SAMPL6 Blind Prediction Challenge

Mehtap Işık, Dorothy Levorse, David L. Mobley, Timothy Rhodes, and John D. Chodera.
Journal of Computer Aided Molecular Design 34:405, 2020. [DOI] [bioRxiv] [data] [GitHub]

We describe the design and data collection (and associated challenges) for the SAMPL6 part II logP octanol-water blind prediction challenge, where the goal was to benchmark the accuracy of force fields for druglike molecules (here, molecules resembling kinase inhibitors).

A small-molecule pan-Id antagonist inhibits pathologic ocular neovascularization

Paulina M. Wojnarowicz, Raquel Lima e Silva, Masayuki Ohanka, Sang Bae Lee, Yvette Chin, Anita Kulukian, Sung-Hee Chang, Bina Desai, Marta Garcia Escolano, Riddhi Shah, Marta Garcia-Cao, Sijia Xu, Rashmi Thakar, Yehuda Goldgur, Meredith A. Miller, Ouathek Ouerfelli, Guangli Yang, Tsutomu Arakawa, Steven K. Albanese, William A. Garland, Glenn Stoller, Jaideep Chaudhary, Rajesh Soni, John Philip, Ronald C. Hendrickson, Antonio Iavarone, Andrew J. Dannenberg, John D. Chodera, Nikola Pavletich, Anna Lasorella, Peter A. Campochiaro, and Robert Benezra
Cell Reports 29:62, 2019 [DOI] [PDF]

We report the discovery and characterization of a small molecule, AGX51, with the surprising ability to inhibit the interaction of Id1 with E47, which leads to ubiquitin-mediated degradation of Ids.

Graph nets for partial charge prediction

Yuanqing Wang, Josh Fass, Chaya D. Stern, Kun Luo, and John D. Chodera.
Preprint ahead of publication.
[arXiv] [GitHub]

Graph convolutional and message-passing networks can be a powerful tool for predicting physical properties of small molecules when coupled to a simple physical model that encodes the relevant invariances. Here, we show the ability of graph nets to predict partial atomic charges for use in molecular dynamics simulations and physical docking.

Sharing data from molecular simulations

Abraham MJ, Apostolov R, Barnoud J, Bauer P, Blau C, Bonvin AMJJ, Chavent M, Chodera JD, Condic-Jurkic K, Delemotte L, Grubmüller H, Howard RJ, Lindahl E, Ollila S, Salent J, Smith D, Stansfeld PJ, Tiemann J, Trellet M, Woods C, and Zhmurov A.
Journal of Chemical Information and Modeling ASAP. [chemRxiv] [DOI] [PDF]

There is a dire need to establish standards for sharing data in the molecular sciences. Here, we review the findings of a workshop held in Stockholm in Nov 2018 to discuss this need.

Ancestral reconstruction reveals mechanisms of ERK regulatory evolution

Dajun Sang, Sudarshan Pinglay, Rafal P Wiewiora, Myvizhi E Selvan, Hua Jane Lou, John D Chodera, Benjamin E Turk, Zeynep H Gümüş, and Liam J Holt.
eLife 2019;8:e38805 [DOI] [eLife] [PDF] [Folding@home data]

To understand how kinase regulation by phosphorylation emerged, we reconstruct the common ancestor of CDKs and MAPKs, using biochemical experiments and massively parallel molecular simulations to study how a few mutations were sufficient to switch ERK-family kinases from high- to low-autophosphorylation.

Binding thermodynamics of host-guest systems with SMIRNOFF99Frosst 1.0.5 from the Open Force Field Initiative

David R. Slochower, Niel M. Henriksen, Lee-Ping Wang, John D. Chodera, David L. Mobley, and Michael K. Gilson.
Journal of Chemical Theory and Computation ASAP. [DOI] [bioRxiv] [GitHub]

We assess the accuracy of the SMIRNOFF99Frosst 1.0.5 force field in reproducing host-guest binding thermodynamics in comparison with the GAFF force field, demonstrating how the SMIRNOFF format for compactly specifying force fields provides comparable accuracy with 20x fewer parameters.

Small-molecule targeting of MUSASHI RNA-binding activity in acute myeloid leukemia

Gerard Minuesa, Steven K Albanese, Arthur Chow, Alexandra Schurer, Sun-Mi Park, Christina Z. Rotsides, James Taggart, Andrea Rizzi, Levi N. Naden, Timothy Chou, Saroj Gourkanti, Daniel Cappel, Maria C Passarelli, Lauren Fairchild, Carolina Adura, Fraser J Glickman, Jessica Schulman, Christopher Famulare, Minal Patel, Joseph K Eibl, Gregory M Ross, Derek S Tan, Christina S Leslie, Thijs Beuming, Yehuda Goldgur, John D Chodera, Michael G Kharas
Nature Communications 10:2691, 2019. [DOI] [bioRxiv] [GitHub] [MSKCC blog post]

We use absolute alchemical free energy calculations to identify the likely interaction site for a small hydrophobic ligand that shows activity against MUSASHI in AML.

The dynamic conformational landscapes of the protein methyltransferase SETD8

Rafal P. Wiewiora*, Shi Chen*, Fanwang Meng, Nicolas Babault, Anqi Ma, Wenyu Yu, Kun Qian, Hao Hu, Hua Zou, Junyi Wang, Shijie Fan, Gil Blum, Fabio Pittella-Silva, Kyle A. Beauchamp, Wolfram Tempel, Hualing Jiang, Kaixian Chen, Robert Skene, Y. George Zheng, Peter J. Brown, Jian Jin, John D. Chodera+, and Minkui Luo+.
eLife 8:e45403, 2019. [DOI] [bioRxiv] [GitHub] [OSF] [movies] [MSKCC blog post]
* These authors contributed equally to this work
+ Co-corresponding authors

In this work, we show how targeted X-ray crystallography using covalent inhibitors and depletion of native ligands to reveal structures of low-population hidden conformations can be combined with massively distributed molecular simulation to resolve the functional dynamic landscape of the protein methyltransferase SETD8 in unprecedented atomistic detail. Using an aggregate of six milliseconds of fully atomistic simulation from Folding@home, we use Markov state models to illuminate the conformational dynamics of this important epigenetic protein.

All Folding@home simulation trajectories for this paper are available on the Open Science Framework.

The trajectories generated for this project were used as the source for a unique musical composition 'Metastable' by George Holloway, performed by the Ligeti String Quartet with visual accompaniment from Robert Arbon.

OpenPathSampling: A Python framework for path sampling simulations. II. Building and customizing path ensembles and sample schemes

David W.H. Swenson, Jan-Hendrik Prinz, Frank Noé, John D. Chodera, Peter G. Bolhuis
Journal of Chemical Theory and Computation 15:837, 2019. [DOI] [bioRxiv] [PDF] [GitHub] [openpathsampling.org]

To make powerful path sampling techniques broadly accessible and efficient, we have produced a new Python framework for easily implementing path sampling strategies (such as transition path and interface sampling). This second publication describes advanced aspects of the theory and details of how to customize path ensembles.