I think LLMs will gradually move toward being fuzzy VMs that output tokens much like VM instructions, so they can prepare multiple conditional branches of tool calls, load and unload useful subprograms, and so on. It might not be expressed exactly like that, but given how poor today's LLMs are at reusing things in their context window, I think we will naturally add features that take us in this direction. Also see frameworks like CodeAct[2] etc.
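A minimal sketch of what "tokens as VM instructions" might mean, assuming a hypothetical interpreter with made-up opcodes (`CALL`, `BRANCH_IF_EMPTY`, `HALT`) and invented tool names; no real provider works this way today:

```python
# Hypothetical: an LLM emits a small "program" of tool-call instructions
# with conditional branches, instead of one tool call per turn.
# All opcodes and tool names are invented for illustration.

def run_program(program, tools):
    """Interpret a list of (opcode, args) pairs emitted by the model."""
    pc = 0       # program counter
    last = None  # result of the most recent tool call
    while pc < len(program):
        op, args = program[pc]
        if op == "CALL":                  # invoke a tool by name
            last = tools[args["tool"]](**args.get("kwargs", {}))
            pc += 1
        elif op == "BRANCH_IF_EMPTY":     # conditional jump on last result
            pc = args["target"] if not last else pc + 1
        elif op == "HALT":
            break
    return last

tools = {
    "search": lambda q: [] if q == "obscure" else ["hit"],
    "fallback_search": lambda q: ["fallback hit"],
}

program = [
    ("CALL", {"tool": "search", "kwargs": {"q": "obscure"}}),
    ("BRANCH_IF_EMPTY", {"target": 3}),   # empty result -> jump to fallback
    ("HALT", {}),
    ("CALL", {"tool": "fallback_search", "kwargs": {"q": "obscure"}}),
    ("HALT", {}),
]

result = run_program(program, tools)  # takes the fallback branch
```

The point is that both branches were "prepared" up front; the interpreter, not another model round-trip, decides which one runs.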
[1] This can be converted to a single tool call with many arguments instead, which you’ll see providers do in their internal tools, but it’s just messier.
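For illustration of footnote [1] (tool names and argument shapes are invented, not any provider's actual API), here is the same work expressed as three small calls versus one call with many arguments:

```python
# Invented example: several small, composable tool calls...
calls = [
    {"tool": "open_file", "args": {"path": "ch1.md"}},
    {"tool": "replace_text", "args": {"old": "teh", "new": "the"}},
    {"tool": "save_file", "args": {}},
]

# ...versus the same work folded into one call with many arguments.
# Fewer round-trips, but a messier, less composable interface.
combined = {
    "tool": "edit_file",
    "args": {"path": "ch1.md", "old": "teh", "new": "the", "save": True},
}

print(len(calls), len(combined["args"]))  # 3 small calls vs 1 call, 4 args
```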
There's an open source tool being developed that is sort of along these lines: https://github.com/raestrada/storycraftr
But:
- it expects the user to be the orchestrator, rather than running fully unattended in a loop, and
- it expects the LLM to output a whole chapter at a time, rather than doing surgical edits: https://github.com/raestrada/storycraftr/blob/b0d80204c93ff1...
(It does use a vector store to help the model get context from the rest of the book, so it doesn't assume everything is in context.)
I’d like to apply what is being suggested in this post, but it doesn’t make sense to me to have to give an LLM access to a text editor just to write a novel. Isn’t there a better way?
anko•4h ago
I believe we juggle seven (plus or minus two) things in our short-term memory. Maybe short-term memory could be a tool!
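Taking the comment literally, a "short-term memory" tool could just be a bounded buffer the model reads and writes. A minimal sketch, where the capacity of 7 echoes the "seven plus or minus two" figure and the tool interface is entirely invented:

```python
from collections import deque

class ShortTermMemory:
    """A tiny hypothetical tool: a bounded buffer of facts to recall.

    The oldest item is evicted once capacity is exceeded, loosely
    mimicking the 7±2 limit of human working memory.
    """

    def __init__(self, capacity=7):
        self.items = deque(maxlen=capacity)

    def remember(self, item):
        self.items.append(item)  # evicts the oldest item when full

    def recall(self):
        return list(self.items)

mem = ShortTermMemory()
for n in range(9):            # store 9 facts in a 7-slot memory
    mem.remember(f"fact-{n}")

print(mem.recall())           # only the 7 most recent facts survive
```

A model could then be prompted to call `remember` and `recall` instead of relying on its full context window.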
We also don't have the knowledge of the entire internet in our heads, yet we can still be more effective at strategy, reasoning, and planning. Maybe a much smaller model would suffice if all it had to do was use tools and have a basic grasp of a language.
dijit•2h ago
It was good advice for me.