what are the most important open problems in molecular simulation right now that stand to benefit from ML-based methods? any good reviews or references to get up to speed rapidly? @gcorso.bsky.social @hannes-stark.bsky.social?
Comments
Sampling, e.g. generating independent statistics from the Boltzmann distribution or solutions of an SDE describing the dynamics (Langevin etc.). The key challenge is to lower the cost per independent sample, in a system-agnostic way.
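To make "cost per independent sample" concrete, here is a minimal sketch, not anything from the thread itself: overdamped Langevin dynamics on a 1D double-well potential, plus a crude integrated autocorrelation time. The potential, step size, temperature, and function names are all illustrative assumptions.

```python
import math
import random

def grad_U(x):
    # Gradient of an assumed toy potential U(x) = (x^2 - 1)^2 (double well).
    return 4.0 * x * (x * x - 1.0)

def langevin_chain(n_steps, dt=1e-2, beta=1.0, x0=-1.0, seed=0):
    """Euler-Maruyama steps of overdamped Langevin dynamics:
    x <- x - grad_U(x) * dt + sqrt(2 * dt / beta) * N(0, 1)."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * dt / beta)
    x, xs = x0, []
    for _ in range(n_steps):
        x = x - grad_U(x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs

def integrated_autocorr_time(xs, max_lag=500):
    """Crude estimate tau = 1 + 2 * sum_k rho(k),
    truncated at the first non-positive autocorrelation rho(k)."""
    n = len(xs)
    mean = sum(xs) / n
    centered = [x - mean for x in xs]
    var = sum(c * c for c in centered) / n
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1)):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        rho = cov / var
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau

xs = langevin_chain(5000)
tau = integrated_autocorr_time(xs)
n_eff = len(xs) / tau  # rough number of effectively independent samples
print(f"steps={len(xs)}  tau~{tau:.1f}  effective samples~{n_eff:.1f}")
```

Each step costs one force evaluation, so the cost per independent sample is roughly `tau` force evaluations. Lowering that number without hand-tuning per system is the challenge described above.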
this was what i thought as well; morally speaking you could say learning surrogate models for the "ground-truth" MD sampler?
i've read your recent ITO papers to try to learn more about this: what's the right assumption on the data? do we have it, or do we need to sample given U but with no data?
Sure this is one way of putting it.
Happy that you took the time to read about our work. There are two different schools: either sampling using only a potential energy model (@msalbergo.bsky.social has some recent work in this direction, NETS i believe), or training on simulation data.
I would say right now, the data-driven approaches seem to be 'working' best, in the sense that we can quickly amortize the cost of data generation if the model allows us to generalize to other systems while being orders of magnitude faster than the reference simulations. Of course, [...]
assuming the model is somehow faithfully reproducing the ground truth. From my perspective it is currently unclear how big a difference these methods can actually make, e.g. how small can we make a training set that still generalizes to a much larger set of systems?
I forgot to mention the enhanced sampling community. Here the focus is (mostly) on discovering slowly relaxing dofs with NNs, and then biasing the dynamics along these dofs. This is an effective approach but suffers a bit from the chicken/egg problem: to find a slow dof you need sampling.
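As a toy illustration of biasing along a slow dof, here is a minimal sketch assuming the slow coordinate is already known (here simply `x` in a 1D double well), which is exactly the part that is hard in practice. The static Gaussian bias, its height and width, and all function names are hypothetical choices, not a method from the thread.

```python
import math
import random

BETA = 5.0    # inverse temperature; the barrier of U is 1, i.e. 5 kT here
HEIGHT = 1.0  # hypothetical bias strength
WIDTH = 0.5   # hypothetical bias width along the slow coordinate

def grad_U(x):
    # Gradient of an assumed toy potential U(x) = (x^2 - 1)^2.
    return 4.0 * x * (x * x - 1.0)

def bias(x):
    # Static bias along the assumed slow dof s(x) = x; negative near the barrier top.
    return -HEIGHT * math.exp(-x * x / (2.0 * WIDTH ** 2))

def grad_bias(x):
    # Analytic derivative of bias(x).
    return -bias(x) * x / WIDTH ** 2

def biased_chain(n_steps, dt=1e-2, x0=-1.0, seed=0):
    """Overdamped Langevin dynamics on the biased potential U + V_bias."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * dt / BETA)
    x, xs = x0, []
    for _ in range(n_steps):
        force = -(grad_U(x) + grad_bias(x))
        x = x + force * dt + noise_scale * rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs

def reweighted_mean(xs, observable):
    # Weights exp(+beta * V_bias) undo the bias, so the average targets exp(-beta * U).
    weights = [math.exp(BETA * bias(x)) for x in xs]
    return sum(w * observable(x) for w, x in zip(weights, xs)) / sum(weights)

xs = biased_chain(5000)
frac_right = reweighted_mean(xs, lambda x: 1.0 if x > 0.0 else 0.0)
print(f"reweighted fraction of time in the right well ~ {frac_right:.2f}")
```

The bias lowers the effective barrier so the chain crosses more often, and the weights recover unbiased averages. The chicken/egg problem is hidden in the assumption that `s(x) = x` is the right coordinate to bias.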