I wish for a vetting tool: have an LLM examine the code, then write a spec of everything it reads and writes, and you can examine that spec before running anything. If something in the list is suspect, you'll know before you're hosed, not after :)
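A minimal sketch of what that could look like, assuming the OpenAI Python SDK; the model name, prompt, and `vet` helper are illustrative, not a real product:

    import sys
    from openai import OpenAI

    PROMPT = (
        "Read the following code. List every file path it reads or writes, "
        "every network endpoint it contacts, and every subprocess it spawns. "
        "Reply as a plain bullet list; write 'none' for empty categories.\n\n"
    )

    def vet(path: str) -> str:
        # Ask the model for a claimed I/O spec of the script at `path`.
        code = open(path).read()
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model choice
            messages=[{"role": "user", "content": PROMPT + code}],
        )
        return resp.choices[0].message.content

    if __name__ == "__main__":
        # A human eyeballs the spec before deciding to execute the script.
        print(vet(sys.argv[1]))

Of course this inherits the thoroughness problem the replies point out: the model can miss reads and writes, so the list is a hint, not a guarantee.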
LLMs are NOT THOROUGH. Not even remotely. I don't understand how anyone can use LLMs and not see this instantly. I have yet to see an LLM do better than roughly a 50% failure rate in the real world, with real-world expectations.
Especially with code review, LLMs catch some things, miss a lot of things, and get a lot of things completely and utterly wrong. It takes someone wholly incompetent at code review to look at an LLM review and go "perfect!".
Edit: Feel free to write a comment if you disagree.
If you build an intuition for what kinds of tasks an LLM (an agent, really) can do well, you can choose to give it only those tasks, and that's where the huge speedups come from.
Don't know what to do about prompt injection, really. But "untrusted code" in the broader sense has always been a risk. If I download and use a library, the author already has free reign of my computer - they don't even need to think about messing with my LLM assistant.
If the first LLM wasn’t enough, the second won’t be either. You’re in the wrong layer.
margarina72•3h ago
1: https://aider.chat/
KronisLV•51m ago
Of course, some might prefer the pure CLI experience, but I'm mentioning it because it also supports a lot of providers.