There is also the possibility of building intelligent workspaces that could genuinely aid scientific research:
A good wrapper has deep domain knowledge baked into it, combined with automation and expert use of the LLM.
It maybe isn't super innovative, but it's a bit of an art form, and it unlocks the utility of the underlying LLM.
To present a potential use case: there's a ridiculous, massive backlog in the Indian judicial system. LLMs can be let loose on the entire workflow: triage cases (simple, complicated, intractable, grouped by legal principle or party), pull up related caselaw, provide recommendations, and throw more LLMs and more reasoning at unclear problems. Now, you can't do this with just a desktop and ChatGPT; you need a systemic pipeline of LLM-driven workflows. But doing that unlocks potentially billions of dollars of value that is otherwise elusive.
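To make that concrete, here's a rough sketch of what one stage of such a pipeline could look like. Everything in it is hypothetical: the `llm_complete` placeholder, the category labels, the prompts, the escalation rule.

```python
# Hypothetical sketch of an LLM-driven case triage pipeline.
# llm_complete() stands in for any chat-completion API; categories,
# prompts, and the escalation step are illustrative, not a real system.
from dataclasses import dataclass, field

CATEGORIES = ["simple", "complicated", "intractable"]

@dataclass
class Case:
    case_id: str
    summary: str
    category: str = ""
    related_caselaw: list = field(default_factory=list)
    recommendation: str = ""

def llm_complete(prompt: str, effort: str = "standard") -> str:
    """Placeholder for a call to whatever LLM API you wire in."""
    raise NotImplementedError

def triage(case: Case) -> Case:
    case.category = llm_complete(
        f"Classify this case as one of {CATEGORIES}:\n{case.summary}"
    ).strip().lower()
    case.related_caselaw = llm_complete(
        f"List citations of related caselaw for:\n{case.summary}"
    ).splitlines()
    # Unclear cases get a second pass with a bigger reasoning budget.
    effort = "high" if case.category == "intractable" else "standard"
    case.recommendation = llm_complete(
        f"Given this case and the caselaw {case.related_caselaw}, "
        f"recommend next procedural steps:\n{case.summary}",
        effort=effort,
    )
    return case
```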
Or just make some up...
Can google search hallucinate webpages?
AFAIK, doing proper RAG is much, much more than this.
What's your technical background if you don't mind me asking?
We are going to see the same for anything that Claude or similar can't handle out of the box.
I'm personally sceptical that LLMs can currently do this (and this is based on Claude, which does put that to the test), but it's still interesting to see.
> The use of an attenuated B. anthracis strain, low spore concentrations, ineffective dispersal, a clogged spray device, and inactivation of the spores by sunlight are all likely contributing factors to the lack of human cases.
Now you may say: that's bacteria, what about viruses? A similar set of problems would arise. How do you successfully grow a virus to high titers? Even vaccine companies struggle to do this with certain viruses. Then the issues of dispersal, infectivity, and mortality arise (too quick and it kills the host without spreading, and authorities will notice; too slow, same problem: authorities will notice). I haven't even mentioned biological engineering, which requires years of technical knowledge and laboratory experience combined with an intimate knowledge of the organism you're working with.
What worries me the most is nature springing a new influenza subtype on us. Our farming practices, especially in developing countries, are bound to breed a new subtype. It happened in 2009 (H1N1pdm) and it is bound to happen again. We got lucky with H1N1pdm.
1. https://pmc.ncbi.nlm.nih.gov/articles/PMC3322761/
2. https://en.wikipedia.org/wiki/Tokyo_subway_sarin_attack
The argument is that if you just ask Claude Code to do niche biomed tasks, it will not have the knowledge to do them by just searching PubMed and doing RAG on the fly, which is fair, given the current gen of LLMs. It's an interesting approach; they show some generalization in the paper (with well-known, tidy datasets), but real-life data is messier. The approach here (correct me if I'm wrong) is to identify the correct tool for a task, then use the generic Python exec tool to shape the data into the acceptable format if needed, try the tool, and go again.
It would be useful to use the tools just as guidance to inform a generic code agent, imo, but executing the "verified" hardcoded tools narrows the error scope: as long as you can check that your data is shaped correctly, the analysis will be correct. Not sure how much of an advantage this is in the long term for working with proprietary datasets, but it's an interesting direction.
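If I'm reading the approach right, the loop is roughly the sketch below. Every name in it (`select_tool`, `run_tool`, `reshape_with_python`, `SchemaError`) is a placeholder I made up, not the paper's actual API.

```python
# Rough sketch of the select-tool -> run -> reshape -> retry loop described
# above. All names are stand-ins; the real framework's API will differ.

class SchemaError(Exception):
    """Raised (hypothetically) when input data doesn't match a tool's expected schema."""
    def __init__(self, expected_schema: str):
        super().__init__(f"expected schema: {expected_schema}")
        self.expected_schema = expected_schema

def select_tool(task_description: str) -> str:
    """Stand-in for the LLM picking one of the 'verified' hardcoded tools."""
    return "differential_expression"  # illustrative only

def run_tool(tool: str, data_path: str):
    """Stand-in for executing the chosen tool on the data as-is."""
    raise SchemaError("samples as rows, genes as columns")

def reshape_with_python(data_path: str, expected: str) -> str:
    """Stand-in for the generic python-exec tool coercing the data into shape."""
    return data_path  # pretend the data was rewritten in place

def analyze(task_description: str, data_path: str, max_attempts: int = 3):
    tool = select_tool(task_description)
    for _ in range(max_attempts):
        try:
            return run_tool(tool, data_path)       # try the tool
        except SchemaError as err:
            # Shape the data into the acceptable format, then go again.
            data_path = reshape_with_python(data_path, expected=err.expected_schema)
    raise RuntimeError("could not shape the data into an acceptable format")
```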
If biomedical research and paper analysis are of interest to you, I've been working on a set of open source projects that enable RAG over medical literature for a while.
PaperAI: https://github.com/neuml/paperai
PaperETL: https://github.com/neuml/paperetl
There is also this tool that annotates papers inline.
AnnotateAI: https://github.com/neuml/annotateai
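For anyone who wants to try the underlying idea without pulling in the full projects: paperai builds on NeuML's txtai, and a bare-bones semantic search over a few abstracts looks something like the snippet below. This is not the paperai API itself; the model name and sample text are just illustrative.

```python
from txtai.embeddings import Embeddings

# A few toy "abstracts"; real pipelines would load parsed articles (e.g. via paperetl).
abstracts = [
    "mRNA vaccines elicit strong neutralizing antibody responses.",
    "Metformin improves glycemic control in type 2 diabetes.",
    "CRISPR base editing corrects a pathogenic point mutation in vivo.",
]

# Build a sentence-embedding index (model name is just a common default).
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
embeddings.index([(uid, text, None) for uid, text in enumerate(abstracts)])

# Semantic search: returns (id, score) pairs ranked by similarity.
for uid, score in embeddings.search("gene editing therapies", 2):
    print(f"{score:.4f} {abstracts[int(uid)]}")
```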
Of course, there's also the possibility of engineering new drugs and treatments, which is super exciting.
For instance, genomic data that seems identical may not actually be identical. In classic biological representations (FASTA), canonical cytosine and methylated cytosine are both collapsed into the letter "C", even though the difference may spur differential gene expression.
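A toy illustration of the FASTA point (the methylation annotation here is a made-up side channel, just to show that the sequence line itself can't carry it):

```python
# Two molecules: same base sequence, but position 3 is 5-methylcytosine in one.
# A plain FASTA record cannot express that difference.
seq = "ATGCGT"
methylated_positions_sample_a = set()   # no methylation
methylated_positions_sample_b = {3}     # 5mC at index 3 (hypothetical annotation)

def to_fasta(header: str, sequence: str) -> str:
    return f">{header}\n{sequence}\n"

fasta_a = to_fasta("sample_a", seq)
fasta_b = to_fasta("sample_b", seq)

# The sequence lines are byte-identical, so a model trained on FASTA alone
# never sees the methylation difference; it has to come from a separate track.
assert fasta_a.split("\n")[1] == fasta_b.split("\n")[1]
```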
What's the optimal tokenization algorithm and architecture for genomic models? How about protein binding prediction? Unclear!
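As one example of how unsettled the tokenization question is: DNABERT used overlapping k-mers, DNABERT-2 switched to a BPE vocabulary, and neither is established as optimal. A toy k-mer tokenizer (k=6 only because that's a common choice, not because it's the right answer):

```python
def kmer_tokenize(sequence: str, k: int = 6) -> list[str]:
    """Overlapping k-mer tokenization, one common (not necessarily optimal) scheme."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

print(kmer_tokenize("ATGCGTAC", k=6))
# ['ATGCGT', 'TGCGTA', 'GCGTAC']
```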
There are so many open questions in biomedical ML.
The openness-impact ratio is arguably as high in biomedicine as anywhere else: if you help answer some of these questions, you could save lives.
Hopefully, awesome frameworks like this lower barriers and attract more people.