Bayesian inference and error modeling for experimental data
All experimental assay data contain error arising from uncertainties in initial compositions, dispensed masses or volumes, measurement noise, model fitting error, and intrinsic biological variability. Accounting for this error to produce a reliable estimate of the uncertainty in experimentally derived quantities is critical, because these uncertainties are the basis for testing hypotheses and building predictive models. In practice, however, it is often difficult even to identify the dominant sources of assay error, let alone propagate them.
Our lab uses two primary tools to build predictive models of assay error and to incorporate all sources of error and uncertainty in data analysis: the bootstrap principle and Bayesian inference. The bootstrap principle allows us to simulate sources of error and uncertainty in experimental assays. This provides a means of optimizing assay configurations and ensuring that the resulting data will meet uncertainty objectives, and it also provides a way to assign meaningful experimental uncertainties to assay data that were reported without error estimates and for which primary data are unavailable. Bayesian inference provides a powerful set of tools for incorporating all known sources of error and uncertainty (such as compound purity, dispensing errors, measurement errors, and uncertainty in the underlying biophysical binding model), as well as for model selection, assay optimization based on expected information gain, and characterization of confidence intervals. To sample from Bayesian posteriors, we use the same fundamental computational strategy, Markov chain Monte Carlo (MCMC), that we use in molecular simulations. Many of the advanced sampling techniques we have developed to make molecular simulations more efficient, such as replica exchange with Gibbs sampling, nonequilibrium candidate Monte Carlo (NCMC), and multiensemble reweighting techniques like MBAR, can also be employed to make sampling from the Bayesian posterior highly efficient.
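As a minimal sketch of the bootstrap idea described above, the following Python example simulates many realizations of a serial dilution in which the stock concentration and every dispensed volume carry random error, then summarizes how the relative uncertainty in concentration grows along the series. The function names, the Gaussian error model, and all numerical values (volumes, concentrations, coefficients of variation) are illustrative assumptions, not instrument specifications or the models used in our published work.

```python
import numpy as np

def simulate_dilution_series(rng, n_points=8, stock_conc=10e-3,
                             transfer_volume=50e-6, diluent_volume=50e-6,
                             volume_cv=0.02, stock_cv=0.05):
    """Simulate one realization of a serial dilution with random errors.

    All default values are hypothetical and chosen only for illustration.
    Concentrations are in mol/L and volumes in L.
    """
    # Uncertainty in the true stock concentration (e.g. weighing, purity).
    conc = stock_conc * (1.0 + stock_cv * rng.standard_normal())
    concentrations = [conc]
    for _ in range(n_points - 1):
        # Each dispensing step delivers a volume with random relative error.
        v_transfer = transfer_volume * (1.0 + volume_cv * rng.standard_normal())
        v_diluent = diluent_volume * (1.0 + volume_cv * rng.standard_normal())
        # Dilute the previous well into fresh diluent.
        conc = conc * v_transfer / (v_transfer + v_diluent)
        concentrations.append(conc)
    return np.array(concentrations)

def bootstrap_dilution_error(n_replicates=2000, seed=0, **kwargs):
    """Repeat the simulated experiment and summarize the spread per well."""
    rng = np.random.default_rng(seed)
    samples = np.array([simulate_dilution_series(rng, **kwargs)
                        for _ in range(n_replicates)])
    mean = samples.mean(axis=0)
    cv = samples.std(axis=0, ddof=1) / mean  # relative uncertainty per well
    return mean, cv

mean_conc, cv_conc = bootstrap_dilution_error()
for i, (m, cv) in enumerate(zip(mean_conc, cv_conc)):
    print(f"well {i}: mean = {m:.3e} M, CV = {100 * cv:.1f}%")
```

Because every realization is a complete simulated experiment, the same loop can be extended with a measurement-noise model and a curve fit to estimate the uncertainty of a derived quantity, or rerun with different assay configurations to compare their expected uncertainties.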
Current projects in this area use Bayesian inference to analyze absorbance and fluorescence data collected from standard laboratory plate readers, with the goal of rigorously characterizing confidence intervals in kinase inhibitor binding assays; Bayesian analysis of isothermal titration calorimetry (ITC) data to accurately capture joint uncertainties in the thermodynamic parameters of protein-ligand interactions; and Bayesian inference for single-molecule experiments, where small-number statistics can lead to large uncertainties in some kinetic parameters.
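To illustrate why small-number statistics matter, here is a hedged toy example in Python: a plain Metropolis MCMC sampler for a single rate constant inferred from only five observed dwell times. It assumes exponentially distributed dwell times and a prior uniform in log(k); both are simplifying assumptions made for this sketch, and the sampler is a generic textbook Metropolis scheme rather than the methods implemented in bhmm, assaytools, or bayesian-itc.

```python
import numpy as np

def log_posterior(log_k, dwell_times):
    """Unnormalized log posterior density of log(k).

    Assumes exponential dwell times with rate k and a prior that is
    uniform in log(k) (an illustrative choice).
    """
    k = np.exp(log_k)
    return len(dwell_times) * log_k - k * np.sum(dwell_times)

def metropolis(dwell_times, n_steps=20000, step_size=0.5, seed=0):
    """Sample k from its posterior using Metropolis moves in log(k)."""
    rng = np.random.default_rng(seed)
    log_k = np.log(1.0 / np.mean(dwell_times))  # start at the ML estimate
    current = log_posterior(log_k, dwell_times)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        # Symmetric Gaussian proposal in log(k).
        proposal = log_k + step_size * rng.standard_normal()
        proposed = log_posterior(proposal, dwell_times)
        # Metropolis acceptance criterion.
        if np.log(rng.random()) < proposed - current:
            log_k, current = proposal, proposed
        samples[i] = log_k
    return np.exp(samples)

# Synthetic data: only five dwell times drawn with a hypothetical true rate.
data_rng = np.random.default_rng(1)
true_k = 2.0
dwell_times = data_rng.exponential(1.0 / true_k, size=5)

k_samples = metropolis(dwell_times)[2000:]  # discard burn-in
low, high = np.percentile(k_samples, [2.5, 97.5])
print(f"posterior mean k = {k_samples.mean():.2f}")
print(f"95% credible interval = [{low:.2f}, {high:.2f}]")
```

Under these assumptions the exact posterior is a gamma distribution with shape n and rate equal to the summed dwell time, so the sampler can be checked against the analytic mean n / sum(dwell_times). With only five events, the 95% credible interval spans roughly a sixfold range in k, which is the kind of uncertainty a point estimate alone would hide.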
SOFTWARE
assaytools: Bayesian biophysical analysis of absorbance and fluorescence assay data from common plate readers [experimental]
bayesian-itc: Bayesian analysis of isothermal titration calorimetry (ITC) data [experimental]
bhmm: Bayesian hidden Markov model toolkit for analysis of single-molecule experiments and molecular simulation data
RESOURCES
assaytools: general API for describing experimental assays in a human- and computer-readable format
COLLABORATORS
David D. L. Minh (Illinois Institute of Technology): Bayesian modeling of isothermal titration calorimetry (ITC)
Frank Noé (Freie Universität Berlin): Bayesian hidden Markov modeling of single-molecule experiments
PERSONNEL
Sonya M. Hanson (postdoctoral fellow): Bootstrap modeling and Bayesian inference for fluorescence assays
Ariën S. (Bas) Rustenburg (PBSB graduate student): Bayesian inference for isothermal titration calorimetry (ITC)
Chaya Stern (TPCB graduate student): Bayesian modeling of single-molecule experiments
SELECTED PUBLICATIONS
Modeling error in experimental assays using the bootstrap principle: Understanding discrepancies between assays using different dispensing technologies
Sonya M. Hanson, Sean Ekins, and John D. Chodera.
Journal of Computer-Aided Molecular Design 29:1073, 2015. [DOI] [PDF] // IPython notebook [GitHub] // preprint: [bioRxiv]
Bayesian hidden Markov model analysis of single-molecule force spectroscopy: Characterizing kinetics under measurement uncertainty
John D. Chodera, Phillip Elms, Frank Noé, Bettina Keller, Christian M. Kaiser, Aaron Ewall-Wice, Susan Marqusee, Carlos Bustamante, and Nina Singhal Hinrichs.
preprint: [arXiv] // used in our 2011 Science paper