Eric S. Raymond: why is there such a huge variance in results from using LLMs?

https://twitter.com/esrtweet/status/2016849708254179501

6•dist-epoch•1w ago

Comments

sara_builds•1w ago

The variance mostly comes down to prompt craft and context management.

People who get consistently good results have usually internalized a few things: (1) being explicit about constraints and output format, (2) providing relevant context without noise, (3) matching the model to the task (reasoning-heavy vs creative vs code), and (4) iterating on the prompt when something fails rather than assuming the model is broken.

I've seen the same person get wildly different results depending on whether they ask "write code to do X" vs "I need a function that takes A, returns B, handles edge case C, and should be optimized for D. Here's the existing code it needs to integrate with: [context]."

The gap between those two approaches can be a 10x difference in usefulness. Most of the "LLMs are useless" crowd and the "LLMs are magic" crowd are just working with very different prompt habits.
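To make the contrast concrete, here is a minimal sketch of assembling the second, more detailed kind of prompt from its parts. The `FunctionRequest` class, its field names, and `build_prompt` are hypothetical names made up for illustration. They are not part of any LLM tool or API, and the resulting string would still have to be sent to whatever model you use.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionRequest:
    """Hypothetical container for the details a specific prompt spells out."""
    goal: str                                  # what the function should do
    inputs: str                                # "takes A"
    returns: str                               # "returns B"
    edge_cases: list[str] = field(default_factory=list)  # "handles edge case C"
    optimize_for: str = ""                     # "optimized for D"
    context: str = ""                          # existing code to integrate with


def build_prompt(req: FunctionRequest) -> str:
    """Assemble a plain-text prompt, skipping any optional part left empty."""
    lines = [
        f"I need a function that {req.goal}.",
        f"It takes: {req.inputs}",
        f"It returns: {req.returns}",
    ]
    if req.edge_cases:
        lines.append("Edge cases to handle:")
        lines.extend(f"- {case}" for case in req.edge_cases)
    if req.optimize_for:
        lines.append(f"Optimize for: {req.optimize_for}")
    if req.context:
        lines.append("Existing code it needs to integrate with:")
        lines.append(req.context)
    return "\n".join(lines)


if __name__ == "__main__":
    request = FunctionRequest(
        goal="deduplicates user records",
        inputs="a list of dicts with an 'email' key",
        returns="a list of dicts, keeping the first occurrence",
        edge_cases=["empty list", "emails differing only by case"],
        optimize_for="readability",
        context="def load_users(path): ...",
    )
    print(build_prompt(request))
```

The point of the structure is only that each field forces you to state something the vague "write code to do X" version leaves for the model to guess.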

tjr•1w ago

It appears to me that the people who consistently get the best results from LLM coding tools are prompting fairly close to the code. Maybe not quite at the level of writing pseudocode, but close enough that they really still need to understand software development.

Which doesn't quite gel with the notion that you don't need programmers, you don't need to know how to program, etc.

I feel pretty confident that, in fact, you don't need to. You probably can get good results without having a clue what you're doing, if you prompt well enough, or prompt long enough, or prompt repeatedly until it works. But I think you will more reliably, maybe even more quickly, get good results if you do know what you're doing, and if you stay reasonably engaged with the development, even if not literally writing the code yourself.

armchairhacker•1w ago

What are your exact prompts (including project context) and generated code?

And for those who are struggling with LLMs, what are their prompts and code?

FrankWilhoit•1w ago

He thinks they ought to converge. What does he think they ought to converge upon? How will he know that thing when he sees it? If he will know it when he sees it, why does he need help making it?

The answer to all of these, of course, is that convergence is not expected and correctness is not a priority. The use of an LLM is a boasting point, full stop. It is a performance. It is "look, Ma, no coders!". And it is only relevant, or possible, because although the LLM code is not right, the pre-LLM code wasn't right either. The right answer is not part of the bargain. The customer doesn't care whether the numbers are right. They care how the technology portfolio looks. Is it currently fashionable? Are the auditors happy with it? The auditors don't care whether the numbers are right: what they care about is whether their people have to go to any -- any -- trouble to access or interpret the numbers.
