Granted, I wish this had been around when I started my journey, instead of having to delve into things like the Amber manual (which, I will grant, is wonderful for its information, though its organization isn't as convenient).
Do you have any resources that you recommend on coarse graining?
I am really interested in the topic.
I also believe Frank Noe is someone to watch in the ML potential space for proteins.
I'm also always happy to talk about CG.
My email OTOH is in my bio.
What complicates things is that the experimental data we get back from labs to validate MD behavior is extremely tricky to work with. Most of what we're working with is NMR data, which shows flexibility in areas of the proteins, but even then we're left with mathematical models to attempt to "make sense" of the flexibility and infer dynamics from it. Sometimes it feels like both an art and a science trying to get meaningful insights from lab data like this.
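To make the "mathematical models" point concrete, here is a minimal sketch of one commonly used quantity for comparing MD with NMR relaxation data: the generalized order parameter S^2 of a bond vector (for example a backbone N-H), estimated in the long-time limit as S^2 = 1/2 * (3 * sum over i,j of <u_i u_j>^2 - 1), where u is the unit bond vector. This is not necessarily what the commenter uses. The function name `order_parameter_s2` is hypothetical, and the sketch assumes the frames were already aligned to a reference so that overall tumbling has been removed.

```python
import math


def order_parameter_s2(vectors):
    """Estimate S^2 from a list of (x, y, z) bond vectors, one per frame.

    Assumes overall molecular tumbling was already removed by aligning
    the trajectory; the vectors need not be normalized.
    """
    n = len(vectors)
    if n == 0:
        raise ValueError("need at least one frame")

    # Normalize each bond vector to unit length.
    units = []
    for x, y, z in vectors:
        norm = math.sqrt(x * x + y * y + z * z)
        units.append((x / norm, y / norm, z / norm))

    # Sum the squared time averages <u_i * u_j> over all 3x3 components.
    total = 0.0
    for i in range(3):
        for j in range(3):
            mean_ij = sum(u[i] * u[j] for u in units) / n
            total += mean_ij ** 2

    return 0.5 * (3.0 * total - 1.0)


# A bond that never moves is fully ordered.
rigid = [(0.0, 0.0, 2.0)] * 5
# A bond that samples the x, y and z axes equally is fully disordered.
isotropic = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
print(order_parameter_s2(rigid), order_parameter_s2(isotropic))
```

Values near 1 indicate a rigid bond and values near 0 indicate nearly unrestricted motion. How well such numbers line up with experiment still depends on sampling and on the model used to fit the NMR data.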
It's extremely difficult to experimentally verify any MD model since, as mentioned in the article, most of the data we're working with are static mugshots in the form of crystal structures.
I'm curious if you've worked with any of those models and how they relate to NMR data and MD simulations.
I've also written a potentially helpful coverage piece on extracting conformations from cryo-EM data: https://www.owlposting.com/p/a-primer-on-ml-in-cryo-electron...
If you can reach out at all, you can find me at [masterfully dot blundered] on the normal g-domain. I briefly skimmed your profile for contact info but could not find any.
It's very accessible and I found it very interesting: https://youtu.be/PGqCeSjNuTY?feature=shared
For MD, specifically the type talked about here, we aren't accounting for all the quantum effects, and that is known. Crystallizing molecules, especially large ones, whether dynamic proteins or ones embedded in lipids, is hard. Crystallizing during transitory states is orders of magnitude more difficult. MD allows us to visualize those transitory states and was used, for example, to observe the unfolding of the spike protein in SARS-CoV-2 to assist in designing mRNA vaccines, because the important amino acids could be observed.
There are a lot of times where it is good enough, or outpaces current experimental techniques, so that it is the tool for the job. But it is not perfect and can very rarely stand completely on its own in, say, drug discovery or other fields.
For anyone with further interest in MD, two of the popular engines, Amber and Gromacs, have excellent documentation for learning (1, 2). MDAnalysis is a popular analysis package. Their docs give a great rundown of what type of information you can glean from MD (3). If you’re strictly interested in eye candy, there’s a fabulous Blender plugin for visualizing MD simulations and protein structures (4). I also wrote a little Python program for setting up simulation systems; you can do some fun stuff with it (5).
(1) https://ambermd.org/Manuals.php
(2) https://manual.gromacs.org/current/index.html
(3) https://www.mdanalysis.org/pages/documentation/
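As one illustration of the kind of information you can glean from a trajectory, here is a minimal pure-Python sketch of per-atom root-mean-square fluctuation (RMSF), a standard flexibility measure. The function name `rmsf` and the nested-list input format are assumptions made for this example; MDAnalysis ships its own trajectory-aware RMSF analysis, and this sketch does not use or reproduce its API. It also assumes the frames are already aligned to a common reference.

```python
import math


def rmsf(frames):
    """Per-atom RMSF from a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom, and all
    frames are assumed to be aligned and to list atoms in the same order.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for a in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [sum(f[a][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((f[a][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Two atoms over two frames: the first never moves, the second
# oscillates by one length unit around x = 2.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsf(frames))
```

High RMSF values flag flexible regions such as loops and termini, which is often the first thing to compare against experimental flexibility data.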
And there is not a 100x hero hacker that could clean this Augean stable.
roughly•1d ago
> And understanding molecular motion is key for everything in biology, everything in biology is vibrating molecules underneath the surface!
Coming into bio as a programmer, this is the absolute sine qua non rule you need to internalize: there are no boundaries between systems, because everything is jiggling atoms. DNA encodes for genes, except the transcription process is heavily mediated by the physical environment and physical constraints of accessing the DNA; RNA translates to amino acid strings, except it’s also a molecule, and so sometimes it folds into a structure and just does shit itself; proteins have a function, except sometimes they have many functions, because the “lock and key” metaphor isn’t wrong, except when you’ve got a billion locks and your key’s kinda floppy, it’ll probably fit more than one. Nature plays with physical systems and will repurpose anything to do anything else - the informatics only take you so far, all the real action is vibrating molecules.
holodro•1d ago
(Similar background as you.) Another sine qua non rule is that evolution created biology; it wasn't engineered like software and it doesn't decompose like software. Evolution creates hairballs that don't respect traditional engineering boundaries and abstraction hierarchies.
From that, along with probabilistic molecular jiggling, we get biological systems that are quite difficult to understand, predict, and control.
kurthr•1d ago
What this means is that running an experiment in many fields is so difficult that replication is a real challenge. There are so MANY ways you can screw up, or you could just have a statistical fluke that screws you over. Just a tiny contamination or seemingly irrelevant missed step will cause a failure. That's why the idea of having journals composed of failed experiments just doesn't work. Unstated experimental process assumptions are legion. Sometimes an expert can look at the result and see what you've done wrong (like bad contacts in "Electron Band Structure In Germanium, My Ass") and often not even that. Sometimes there's something interesting in the failure, but 99% of the time it's just that your pitch is so bad you can't hit the strike zone. Do better!
The things that are easy to replicate (and usually they've been specifically designed that way, like Starbucks' over-roasted beans) have actually been reduced to engineering. They're not on the edge where scientists can get published. That way perverse incentive madness lies.
Enjoy the controllability of inputs, the repeatability of bugs, the near perfection of compilers and memory allocation, the complete independence of variables while you can. Unless, that is, you like Rowhammer and voltage glitch attacks.