Treating the LLM like a compiler is a much more scalable, extensible and composable mental model than treating it like a junior dev.
A version that DID work like a compiler would be super interesting - it could replace the function body with generated Python code on your first call and then reuse that in the future, maybe even caching state on disk rather than in memory.
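A rough sketch of that idea as a decorator, using simonw's llm library as the generation backend (the cache location, prompt wording, and the slugify example are my own assumptions, not anyone's actual implementation):

    import functools
    import inspect
    import pathlib

    import llm  # assumption: simonw's llm Python library as the backend


    def compiled(func):
        """On the first call, ask an LLM to implement func from its signature
        and docstring, cache the generated source on disk, then reuse it."""
        cache = pathlib.Path(".llm_cache") / f"{func.__name__}.py"

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not cache.exists():
                prompt = (
                    "Implement this Python function. Reply with only the "
                    "code, no fences:\n" + inspect.getsource(func)
                )
                cache.parent.mkdir(exist_ok=True)
                cache.write_text(llm.get_model().prompt(prompt).text())
            namespace = {}
            exec(cache.read_text(), namespace)  # real use would sandbox this
            return namespace[func.__name__](*args, **kwargs)

        return wrapper


    @compiled
    def slugify(title: str) -> str:
        """Lowercase the title, drop punctuation, join words with hyphens."""

Delete the cache file and the next call "recompiles"; keep it and the LLM is never consulted again.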
I can see this eventually going in the direction of "bidirectional synchronization" of the NL representation and the code representation (similar to how jupytext lets you work with notebooks in the browser or markdown in your editor). But a single representation that's completely NL, deliberately throwing away the code representation, sounds like the opposite of productivity.
I would like to try something like this in Rust:

- you use a macro to stub out the body of functions (so you just write the signature)
- the build step fills in the code and caches it
- on failures, the build step is allowed to change the function bodies generated by LLMs until it satisfies the test / compile steps
- you can then convert the satisfying LLM-generated function bodies into hard code (or leave them within the domain of "changeable by the LLM")
It sandboxes what the LLM can actually alter, and makes the generation happen in an environment where you can check right away whether it was done correctly. Being Rust, you get a lot more verification for free. And, crucially, it keeps you in the driver's seat.
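The build loop itself is easy to sketch. Here's a minimal Python version of that build step, where splice_into_source is a hypothetical helper that patches the generated body into the stubbed file:

    import subprocess

    import llm  # assumption: simonw's llm library as the generation backend


    def fill_stub(signature: str, spec: str, max_attempts: int = 5) -> str:
        """Generate a Rust function body, retrying until build and tests pass."""
        model = llm.get_model()  # whatever model is configured as the default
        feedback = ""
        for _ in range(max_attempts):
            body = model.prompt(
                f"Write only the Rust body for `{signature}`.\n"
                f"Spec: {spec}\n{feedback}"
            ).text()
            splice_into_source(signature, body)  # hypothetical helper
            for cmd in (["cargo", "build"], ["cargo", "test"]):
                result = subprocess.run(cmd, capture_output=True, text=True)
                if result.returncode != 0:
                    # Feed compiler/test output back into the next attempt.
                    feedback = f"Previous attempt failed:\n{result.stderr}"
                    break
            else:
                return body  # known-good artifact; cache or hard-code it
        raise RuntimeError(f"no passing body after {max_attempts} attempts")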
Yeah, I do think that LLMs acting as compilers for super high-level specs (the new "code") is a much better approach than chatting with a bot to try to get the right code written. LLM-derived code should not be a "peer" of human-written code IMO; it should exist at some subordinate level.
The fact that they're non-deterministic makes it a bit different from a traditional compiler but as you say, caching a "known good" artifact could work.
You can even pin the last result:
pinned function main() {
// Print "Hello World" to the console
}
LLM is a cool CLI tool, but IMO LiteLLM is a better Python library.
The problem with LiteLLM's approach is that every model provider needs to be added to the core library - in https://github.com/BerriAI/litellm/tree/main/litellm/llms - and then shipped as a new release.
LLM uses plugins because then there's no need to sync new providers with the core tool. When a new Gemini feature comes out I ship a new release of https://github.com/simonw/llm-gemini - no need for a release of core.
I can wake up one morning and LLM grew support for a bunch of new models overnight because someone else released a plugin.
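For context, an llm model plugin is essentially just a module implementing a registration hook; a toy sketch (a real plugin like llm-gemini calls the provider's API in execute):

    import llm


    class EchoModel(llm.Model):
        model_id = "echo"  # toy model for illustration

        def execute(self, prompt, stream, response, conversation):
            # A real plugin would call the provider's API here and yield chunks.
            yield f"echo: {prompt.prompt}"


    @llm.hookimpl
    def register_models(register):
        # llm discovers this hook through the plugin's entry point, so new
        # providers ship independently of the core tool.
        register(EchoModel())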
I'm not saying "LLM is better than LiteLLM" here - LiteLLM is a great library with a whole lot more contributors than LLM, and it's been fully focused on being a great Python library, while LLM has so far had more effort invested in the CLI than in the Python library side.
I am confident that a plugin system is a better way to solve this problem generally though.
Imagine this: it would be cool if these functions essentially boiled down to a tiny distilled model just for that functionality, instead of an API call to a foundation one.
Pasting a piece of code into an LLM with the prompt "comment the shit out of this" works quite well.
You run it like this:
llm install llm-docsmith
llm docsmith ./scripts/main.py
And it uses a Python concrete syntax tree (via https://pypi.org/project/libcst/) to apply changes to just the docstrings, without any risk of editing other code. A minimal sketch of that pattern follows below.

TBH I find docstrings very tedious to write. I can see how this would be a great specification for an LLM, but I don't know that it's actually better than a plain-text description of the function, since LLMs can handle those just fine and they are easier to write.
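To illustrate the libcst approach mentioned above - a minimal sketch (the docstring text would come from the LLM; error handling and one-line function bodies are ignored here):

    import libcst as cst


    class DocstringRewriter(cst.CSTTransformer):
        """Swap in new docstrings while leaving every other token untouched."""

        def leave_FunctionDef(self, original_node, updated_node):
            new_doc = cst.SimpleStatementLine(
                body=[cst.Expr(cst.SimpleString('"""LLM-generated docstring."""'))]
            )
            statements = list(updated_node.body.body)
            if updated_node.get_docstring() is not None:
                statements[0] = new_doc  # replace the existing docstring
            else:
                statements.insert(0, new_doc)  # add one where it was missing
            return updated_node.with_changes(
                body=updated_node.body.with_changes(body=statements)
            )


    source = open("scripts/main.py").read()
    print(cst.parse_module(source).visit(DocstringRewriter()).code)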
I initially used the same approach in my library, but changed it to explicitly pass the llm object around - in actual production code that's easier and more flexible to use.
Examples (2nd one also with docstring-based llm query and structured answer): https://github.com/senko/think?tab=readme-ov-file#examples
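In that style the model is just another argument. A minimal rendering of the idea using the llm library (my own sketch, not think's actual API):

    import llm


    def summarize(model: llm.Model, text: str) -> str:
        # The caller controls which model runs; easy to swap or stub in tests.
        return model.prompt(f"Summarize in one sentence: {text}").text()


    default = llm.get_model()  # whichever model the user has configured
    print(summarize(default, "LLMs as compilers for natural-language specs."))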
shaism•1w ago
At that time, LLMs weren't as proficient at coding as they are today. Nowadays, the decorator approach might even go further and not just wrap LLM calls but also write Python code based on the description in the docstring.
This would incentivize writing unambiguous docstrings, and guarantee (if the LLMs don't hallucinate) consistency between code and documentation.
It would bring us closer to the world that Jensen Huang described, i.e., natural language becoming a programming language.
psunavy03•1w ago
And now the COBOL devs are retiring after a whole career...
psunavy03•1w ago
The reason it looks so odd today is that so much of modern software is instead the intellectual heir of C.
And yeah, the "skill cap" of describing things is theoretically infinite. My point was that this has been tried before, and we don't yet know how close an LLM's actual capabilities come to that ideal. People have been trying for decades to describe things in English that still ultimately need to be described in code to work; that's why the software industry exists in the first place.