Some thoughts about what I am looking forward to this year from my vantage point of computational molecular biology. One mega-trend for me: we will definitely see more AI methods of all sorts emerge.
The mistake, I think, when looking from further away is to view AI through the lens of Large Language Models - these are great (well... at least very useful) for text, but so much in science (and in life!) is not text! The set of "tailored" or "narrow" AI in science will only increase.
(BTW - I think of LLMs as just tailored or narrow - narrow around text. I think it's weird to elevate text-based descriptions of the world as somehow more general than others.)
These AI techniques will range from the "useful" (say - segmentation) through to the transformative (AlphaFold will never get old, I suspect!). I am very excited about our work on DELPHI, on attention-based generative transformers for healthcare tokens (e.g., ICD10): https://www.medrxiv.org/content/10.1101/2024.06.07.24308553v1
I am now far more comfortable about how interpretable AI methods can potentially be - nearly every AI method has some sort of internal space, and you can get access to it; that space doesn't come with labels on the axes, but a scientist can bring "out of model" data to explore these spaces.
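To make the "out of model" idea concrete, here is a minimal sketch, not anyone's actual method: it uses synthetic numbers as a stand-in for a model's internal space and asks which unlabeled axis best tracks an external measurement. The names `rank_axes`, `SIGNAL_AXIS` and the toy data are all hypothetical, chosen only for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_axes(embeddings, external):
    """Rank axes of an internal space by |correlation| with an external label."""
    n_dims = len(embeddings[0])
    scores = []
    for d in range(n_dims):
        column = [row[d] for row in embeddings]
        scores.append((abs(pearson(column, external)), d))
    return sorted(scores, reverse=True)


# Synthetic stand-in for a model's internal space (assumption: 8 unlabeled axes).
random.seed(0)
N_SAMPLES, N_DIMS, SIGNAL_AXIS = 200, 8, 3
embeddings = [[random.gauss(0, 1) for _ in range(N_DIMS)] for _ in range(N_SAMPLES)]

# "Out of model" data: a measurement the model never saw, which in this toy
# setup secretly depends on one axis plus noise.
external = [2.0 * row[SIGNAL_AXIS] + random.gauss(0, 0.5) for row in embeddings]

ranking = rank_axes(embeddings, external)
best_corr, best_axis = ranking[0]
print(f"axis {best_axis} tracks the external measurement (|r| = {best_corr:.2f})")
```

In a real setting the signal is rarely aligned with a single axis, so one would more likely fit a direction across all axes (a linear probe) than rank axes one at a time; the per-axis version is just the simplest form of the idea.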
The other trend is bringing experimental techniques closer to AI methods directly - what my @EMBL colleagues call AI with "lab in the loop". This could be in a "traditional" lab setting, but I am more excited about cases which are roboticised, with feedback loops in the minutes, seconds or even milliseconds.
Could you elaborate? Language is general in that it can describe many things (and contains mathematics via notation). Traditionally narrow AIs working with images etc are much less general, or not? (I’m not saying language will be more efficient than abstract embedding spaces, maybe even short term)
Just think about maps - even quite trivial maps can't be easily expressed as language, except by using language as drawing primitives, at which point you need - well - a representation as a map (flat or geodesic).
Same for... DNA sequences of genomes. Sure, you can say "chromosome one is Adenine followed by Cytosine followed by Adenine ..." for ... 3 billion bases ... but it is clunky, and there is almost certainly a better representation of what is going on.
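As a small illustration of the point about representation, the sketch below contrasts spelling a sequence out in words with packing it at two bits per base. The function names and the toy sequence are made up for the example, and real formats also have to handle ambiguous bases such as N, which this ignores.

```python
# Two bits are enough to distinguish the four bases.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}
BASE_NAMES = {"A": "adenine", "C": "cytosine", "G": "guanine", "T": "thymine"}


def spell_out(seq):
    """The 'language' representation: one word per base."""
    return " followed by ".join(BASE_NAMES[base] for base in seq)


def pack(seq):
    """Pack a sequence of A/C/G/T into bytes, four bases per byte."""
    packed = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        packed[i // 4] |= BASE_TO_BITS[base] << (2 * (i % 4))
    return bytes(packed)


def unpack(packed, length):
    """Recover the first `length` bases from the packed bytes."""
    return "".join(
        BITS_TO_BASE[(packed[i // 4] >> (2 * (i % 4))) & 0b11]
        for i in range(length)
    )


seq = "ACAGTTGCA"  # toy sequence, not real genomic data
print(len(spell_out(seq)), "characters spelled out in words")
print(len(pack(seq)), "bytes packed at two bits per base")
```

Even this packing is only a storage trick; the broader point stands that neither words nor bits by themselves capture what the sequence is doing biologically.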
In the world around us there are many, many things - maps, weather, money flows, genomes, calendars ... all sorts - for which you could use language, but it is clear that language is a very poor representation of what is going on.