proteinator.bsky.social
PhD candidate in bioinformatics. Protein structure prediction / Protein design
17 posts 23 followers 55 following
comment in response to post
🤡🤡🤡
comment in response to post
😂😂😂
comment in response to post
Ah yeah, didn't fit into the previous post: there's no bias in sec structure distro in the PDB. Also, in the paper I attached above we have a plot showing the PDB mean with regard to helices and strands, alongside those of different gen models 😅
comment in response to post
Gen models just learn the locality of the helices very quickly and “cheat” on the metrics by overrepresenting helices. Especially for bigger proteins > 500 aa it really becomes apparent..
comment in response to post
Not sure it's that easy. We recently proposed a checkpointing selection criterion (arxiv.org/abs/2411.05238) to better match the distribution of secondary structure elements of native proteins from the PDB, but it doesn't seem to work too well.
comment in response to post
the consequence of that is pretty well known - rock-stable rigid proteins 🤠
comment in response to post
Keep me in the loop, I'd be interested in seeing it! We just need someone to run this multiple times xD However, my guess would be that strands will approach 0% on avg in this setup.
comment in response to post
Typically gen models for protein structure are mode collapsed towards alpha-helices, and AF3 won't be an exception here either if used in such a way. The reason it hallucinates helices is simply that they're easy to learn as a way of minimizing the diffusion loss function during training.
comment in response to post
That's an interesting assumption, though I don't think this will work for something bigger than, let's say, 150-200 aa. And clearly it will hallucinate helical bundles just arranged in slightly (maybe not) different topologies. Not sure it's of any meaning 😅
comment in response to post
I'm not sure I'm following. How is it useful if seq -> str mapping is not guaranteed anymore? This puts the equality sign between random seq and pMPNN generated seq
comment in response to post
yeah, doesn't change the thing. There should be just an alternative, non-static encoding of protein structures. Maybe as a multivariate energy (not in the physical sense) landscape of protein conformations 😀
comment in response to post
Well, it clearly shows the over-reliance on conformations of crystal structures. I guess training should include some probability distros of protein structures accounting for dynamics, though it's obviously a non-trivial problem to solve..
comment in response to post
Looks like recursion sneaked in 😂
comment in response to post
what is this 😅 ...
comment in response to post
At least it's encouraging to see that the old but gold SE(3) eq architecture outperforms all atom diffusion models in the low RNA structure data regime 😀
comment in response to post
As promised on twitter - only horses 😂