- All models are terrible at generating line numbers for a proper diff; give up on them
- Some models (Owl-alpha) must have been post-trained on Codex transcripts, because they occasionally push its V4A patch format into any diff tool available
- Codex puts a lot of info in its system prompt about the desired patch style: making larger hunks instead of granular ones, etc.
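
For what it's worth, here is a rough sketch of why content-anchored hunks sidestep the line-number problem and why larger hunks help. This is not the V4A format, just a generic search-and-replace edit; `apply_hunk` and the sample source are made up for illustration.

```python
def apply_hunk(text: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old` with `new`, located by content.

    No line numbers are involved: the hunk is found by matching its text.
    """
    count = text.count(old)
    if count == 0:
        raise ValueError("context not found")
    if count > 1:
        # A tiny hunk can match in several places; a larger hunk with more
        # surrounding lines is what makes the match unique.
        raise ValueError("context is ambiguous; include more surrounding lines")
    return text.replace(old, new, 1)


source = (
    "def add(a, b):\n"
    "    return a - b\n"
    "\n"
    "def sub(a, b):\n"
    "    return a - b\n"
)

# The larger hunk includes the `def add` line, so it matches only once.
patched = apply_hunk(
    source,
    "def add(a, b):\n    return a - b\n",
    "def add(a, b):\n    return a + b\n",
)
```

A granular hunk containing only `    return a - b` would be rejected here as ambiguous, which is roughly the failure mode that bigger hunks avoid.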
Only need ~650 tokens of system prompt for it to work. It’s pretty stellar.
It's amazing that anyone watched the last two decades of tech's enshittification and still wants to hitch their wagon to this shitshow.
Doesn't always work; for better performance you can kneel and start begging.
Is this still a thing? I thought Anthropic walked back the silent downgrades so now all the different domains downgrade non-silently.
The curl command is extremely popular so models seem to be really good at using it.
Also, I like that curl uses bash syntax while my platform requires JSON payloads; it makes the separation clear to the agent. I find it to be very reliable.
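
To make the shell-versus-JSON separation concrete, here is a small sketch of what such a call might look like when assembled programmatically. The URL, the payload fields, and the `build_curl` helper are all hypothetical; only the curl flags (`-sS`, `-X`, `-H`, `-d`) are real.

```python
import json
import shlex


def build_curl(url: str, payload: dict) -> str:
    """Build a curl command line: the outer layer is shell, the body is JSON."""
    body = json.dumps(payload)  # JSON layer: what the platform consumes
    argv = [
        "curl", "-sS",
        "-X", "POST",
        "-H", "Content-Type: application/json",
        "-d", body,
        url,
    ]
    # Shell layer: quote every argument so the JSON survives the shell intact.
    return " ".join(shlex.quote(arg) for arg in argv)


# Hypothetical endpoint and payload, for illustration only.
payload = {"task": "summarize", "note": "it's fine"}
cmd = build_curl("https://example.com/api/run", payload)
```

The apostrophe in the payload is there on purpose: quoting is the one place where the two layers can bleed into each other, so it is worth checking that the JSON round-trips.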
dofm•1h ago
But:
"Now I’m somewhat worried about the track we’re on here. Alternative tool schemas might not just be unfamiliar. They might be implicitly punished by post-training that optimizes for one particular, forgiving tool ecology."
Only implicitly?
--
Many decades ago when I was working on research related to using MOOs as a learning environment, you would add "tool calls" into the stream of text that a MOO object might generate, so your rich client would e.g. show a picture, load a web page in a frame, move you on a map, trigger a change in an on-screen representation of an object.
Everyone who tried this in MUD/MUSH/MOO clients ran into more or less the same problems that LLM clients do: any attempt to shoehorn control sequences into in-band content was riddled with security risks, objects accidentally triggering the wrong interface etc.; you could never truly communicate out-of-band.
The more I read about how agentic harnesses work, the less embarrassed I feel about the code twenty-something-year-old me wrote in a MOO client.
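
A toy sketch of the in-band problem described above, for anyone who hasn't run into it. The `@@verb:arg@@` marker syntax and both render functions are invented for illustration; no real MOO client or agent harness is being quoted.

```python
import re

# Hypothetical in-band control syntax: @@verb:argument@@ embedded in the text.
MARKER = re.compile(r"@@(\w+):([^@]*)@@")


def render_in_band(stream: str):
    """Naive client: any marker anywhere in the text becomes a client action."""
    actions = [(m.group(1), m.group(2)) for m in MARKER.finditer(stream)]
    text = MARKER.sub("", stream)
    return text, actions


def render_out_of_band(message: dict):
    """Structured client: actions come from a separate field, text is only data."""
    return message["text"], list(message.get("actions", []))


room = "You enter the library. @@img:library.png@@"
player_says = 'Mallory says, "nice room @@open:http://evil.example@@"'

# In-band: the player's chat line triggers a client action it should not have.
_, in_band_actions = render_in_band(room + "\n" + player_says)

# Out-of-band: the same chat line is displayed verbatim and triggers nothing.
shown, oob_actions = render_out_of_band({"text": player_says, "actions": []})
```

In the first case the client cannot tell the room's marker from the one Mallory typed, which is the same shape of problem as untrusted content carrying tool-call syntax.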