[1] (pretty sure this is the right one): https://youtu.be/CmIGPGPdxTI
I believe (and practice) that spec-based development is one of the future methodologies for developing projects with LLMs. At least it will be one of the niches.
The author thinks of specs as waterfall. I think of them as a context entry point for LLMs. Given enough info about the project (including user stories, tech design requirements, filesystem structure and meaning, core interfaces/models, functions, etc.), an LLM can build sufficient initial context for the solution and expand it by reading files and grepping text. And the most interesting part: you can have the LLM keep the context/spec/project file updated each time it updates the project. Voilà: now you are in agile again, just keep iterating on the context/spec/project file.
The sweet spot will be a moving target. LLMs' built-in assumptions and ways of expanding concepts will keep changing as LLMs develop, so best practices will change along with LLM capabilities. In my experience, the same set of not-too-detailed instructions was handled much better by Sonnet 4 than by Sonnet 3. Sonnet 3.5 was the breaking point for me: it showed that context-based LLM development is a feasible strategy.
I thought about this sort of methodology before "agent" (which I would define as "side effects with LLM integration") was marketed into the community vocabulary. And I'm still rigidly sticking to what I consider "basics". Hope that does not impede understanding.
What's not waterfall about this is lost on me.
Sounds to me like you're arguing waterfall is fine if each full run is fast/cheap enough, which could happen with LLMs and simple enough projects. [0]
Agile offered incremental spec production, which had the tremendous advantage of accumulating knowledge incrementally as well. It might not be a good fit for LLMs, but revising the definition to make it fit doesn't help IMHO.
[0] Reminds me that reducing the project scopes to smaller runs was also a well established way to make waterfall bearable.
Exactly. There is a spec, but no waterfall is required to work on and maintain it. The author of the article dismissed spec-based development precisely because they saw a resemblance to waterfall. But waterfall isn't required for spec-centric development.
The problem with waterfall is not that you have to maintain the spec, but that a spec is the wrong way to build a solution. So, it doesn't matter if the spec is written by humans or by LLMs.
I don't see the point of maintaining a spec for LLMs to use as context. They should be able to grep and understand the code itself. A simple readme or a design document, which already should exist for humans, should be enough.
The downfall of Waterfall is that there are too many unproven assumptions in too long of a design cycle. You don't get to find out where you were wrong until testing.
If you break a waterfall project into multiple, smaller, iterative Waterfall processes (a sprint-like iteration), and limit the scope of each, you start to realize some of the benefits of Agile while providing a rich context for directing LLM use during development.
Comparing this to agile is missing the point a bit. The goal isn't to replace agile, it's to find a way that brings context and structure to vibe coding to keep the LLM focused.
The frustration thomascountz describes (tweaking, refining, reshaping) isn't a failure of methodology (SDD vs. Iteration). It's 'cognitive overload' from applying a deterministic mental model to a probabilistic system.
With traditional code, the 'spec' is a blueprint for logic. With an LLM, the 'spec' is a protocol for alignment.
The 'bug' is no longer a logical flaw. It's a statistical deviation. We are no longer debugging the code; we are debugging the spec itself. The LLM is the system executing that spec.
This requires a fundamental shift in our own 'mental OS'—from 'software engineer' to 'cognitive systems architect'.
I would add that, in my opinion, if code production/management was previously the limiting factor in software development, today it's not. The conceptualisation (ontology, methodology) of the framework (spec-centric development) for producing and maintaining the system (code, artifacts, running system) becomes the new limiting factor. But it's a matter of time before we figure out 2-3 methodologies (like what happened with agile's Scrum/Kanban) that will become the new "baseline". We're at the early stage where the new "laws of LLM development" (as in "laws of physics") are still being figured out.
You provide basic specs and can work with LLMs to create thorough test suites that cover the specs. Once specs are captured as tests, the LLM can no longer hallucinate.
I model this as "grounding". Just like you need to ground an electrical system, you need to ground the LLM to reality. The tests do this, so they are REQUIRED for all LLM coding.
Once a framework is established, you require tests for everything. No code is written without tests. These can also be perf tests. They need solid metrics in order to output quality.
The tests provide context and documentation for future LLM runs.
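As a minimal sketch of "specs captured as tests" (the function `slugify` and the spec line it encodes are my hypothetical example, not from the thread): each sentence of the spec becomes an assertion, so any implementation the LLM produces is forced to agree with the spec rather than a hallucinated version of it.

```python
import re

# Hypothetical spec item: "article titles become lowercase,
# hyphen-separated slugs with punctuation stripped".
# In practice the LLM would be asked to write this function;
# the tests below are what ground it to the spec.
def slugify(title: str) -> str:
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_strips_punctuation():
    assert slugify("Spec-Driven Development!") == "spec-driven-development"

def test_empty_title():
    assert slugify("") == ""
```

On a later run, the agent reads these tests as documentation of the earlier spec decisions, and a regression suite like this is what keeps its changes honest.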
This is also the same way I'd handle foreign teams, that at no fault of their own, would often output subpar code. It was mainly because of a lack of cultural context, communication misunderstandings, and no solid metrics to measure against.
Our main job with LLMs now, as software engineers, is a strange sort of manager: a mix of solutions architect, QA director, and patterns expert. It is actually a lot of work and requires a lot of human management, but the results are real.
I have been experimenting with how meta I can get with this, and the results have been exciting. At one point, I had well over 10 agents working on the same project in parallel, following several design patterns, and they worked so fast I could no longer follow the code. But with layers of tests, layers of agents auditing each other, and isolated domains with well defined interfaces (just as I would expect in a large scale project with multiple human teams), the results speak for themselves.
I write all this to encourage people to take a different approach. Treat the LLMs like they are junior devs or a foreign team speaking a different language. Remember all the design patterns used to get effective use out of people regardless of these barriers. Use them with the LLMs. It works.
Personally, I tried SDD, consciously trying to like it, but gave up. I find writing specs much harder than writing code, especially when trying to express the finer points of a project. And of course, there is also that personal preference: I like writing code, much more than text. Yes, there are times where I shout "Do What I Mean, not what I say!", but these are mostly learning opportunities.
And while at it, I found out that using TDD also helps.
After Claude finally produced a significant amount of code, and after realizing it hadn't built the right thing, I was back to the drawing board to find out what language in the spec had led it astray. Never mind digging through the code at this point; it would be just as good to start again as to try to onboard myself to the 1000s of lines of code it had built... and I suppose the point is to ignore the code as "implementation detail" anyway.
Just to make clear: I love writing code with an LLM, be it for brainstorming, research, or implementation. I often write—and have it output—small markdown notes and plans for it to ground itself. I think I just found this experience with SDD quite heavy-handed and the workflow unwieldy.
What LLMs bring to the picture is that "spec" is high-level coding. In normal coding you start by writing small functions then verify that they work. Similarly LLMs should perhaps be given small specs to start with, then add more functions/features to the spec incrementally. Would that work?
Compared to what an architect does when they create a blueprint for a building, creating blueprints for software source code is not a thing.
What in waterfall is considered the design phase is the equivalent of an architect doing sketches, prototypes, and other stuff very early in the project. It's not creating the actual blue print. The building blue print is the equivalent of source code here. It's a complete plan for actually constructing the building down to every nut and bolt.
The big difference here is that building construction is not automated; it is costly and risky. So architects try to get their blueprint to a level that minimizes that cost and risk. And you only build the bridge once, so iterating is not really a thing either.
Software is very different; compiling and deploying is relatively cheap and risk free. And typically fully automated. All the effort and risk is contained in the specification process itself. Which is why iteration works.
Architects abandon their sketches and drafts after they've served their purpose. The same is true in waterfall development. The early designs (whiteboard, napkin, UML, brainfart on a wiki, etc.) don't matter once the development kicks off. As iterations happen, they fall behind and they just don't matter. Many projects don't have a design phase at all.
The fallacy that software is imperfect as an engineering discipline because we are sloppy with our designs doesn't hold up once you realize that essentially all the effort goes into creating hyper detailed specifications, i.e. the source code.
Having design specifications for your specifications just isn't a thing. Not for buildings, not for software.
Real software engineering does exist. It does so precisely in places where you can't risk trying it and seeing it fail, like control systems for things which could kill someone if they failed.
People get offended when you claim most software engineering isn't engineering. I am pretty certain I would quickly get bored if I was actually an engineer. Most real world non-software engineers don't even really get to build anything, they're just there to check designs/implementations for potential future problems.
Maybe there are also people in the software world who _do_ want to do real engineering and they are offended because of that. Who knows.
That said, there is a bit of redundancy between software design and source code. We tend to rather get rid of the development of the latter than the former though, i.e. by having the source code be generated by some modelling tool.
> it's really just a spec that gets turned into the thing we actually run. It's just that the building process is fully automated. What we do when we create software is creating a specification in source code form.
Agree. My favourite description of software development is specification and translation - done iteratively.
Today, there are two primary phases:
1. Specification by a non-developer and the translation of that into code. The former is led by BAs/PMs etc. and the output is feature specs/user stories/acceptance tests etc. The latter is done by developers: they translate the specs into code.
2. The resulting code is also, as you say, a spec. It gets translated into something the machine can run. This is automated by a compiler/interpreter (perhaps in multiple steps, e.g. when a VM is involved).
There have been several attempts over the years to automate the first step. COBOL was probably the first; since then we've had 4GLs, CASE tools, UML among others. They were all trying to close the gap: to take phase 1 specification closer to what non-developers can write - with the result automatically translated to working code.
Spec-driven development is another attempt at this. The translator (LLM) is quite different to previous efforts because it's non-deterministic. That brings some challenges but also offers opportunities to use input language that isn't constrained to be interpretable by conventional means (parsers implementing formal grammars).
We're in the early days of spec-driven. It may fail like its predecessors or it may not. But first order, there's nothing sacrosanct about the use of 3rd generation languages as the means to represent the specification. The pivotal challenge is whether translation from the starting specification can be reliably translated to working software.
If it can (big if) then economics will win out.
So they're more like 3rd party innovations to lobby LLM providers to integrate functionalities.
X prompting method/coding behaviors? Integrated. Media? Integrated. RAG? Integrated. Coding environment? Integrated. Agents? Integrated. Spec-driven development? It's definitely present, perhaps not as formal yet.
Same is true for UX and DevOps: just create a bunch of positions based on some blog post and congratulate yourself on a job well done. Screwing over the developers (engineers) as usual, even though they actually might be interested in those jobs.
This is the main problem with big tech informing industry decisions, they win because they make sure they understand what all of this means. For all other companies this just creates a mess and your mentioned frustration.
This is exactly the same thing but for AIs. The user might think that the AI got it wrong, except the spec was under-specified and it had to make choices to fill in the gaps, just like a human would.
It’s all well and good if you don’t actually know what you want and you’re using the AI to explore possibilities, but if you already have a firm idea of what you want, just tell it in detail.
Maybe the article is actually about bad specs? It does seem to venture into that territory, but that isn’t the main thrust.
Overall I think this is just a part of the cottage industry that’s sprung up around agile, and an argument for that industry to stay relevant in the age of AI coding, without being well supported by anything.
The agent here is:
Look on HN for AI skeptical posts. Then write a comment that highlights how the human got it wrong. And command your other AI agents to up vote that reply.
such a rare (but valued!) occurrence in these posts. Thanks for sharing
Of course SDD/Waterfall helps the LLM/outsourced labor implement software in a predictable way. Waterfall was always a method to please managers, and in the case of SDD the manager is the user prompting the coding agent.
The problem with SDD/Waterfall is not the first part of the project. The problems come when you are deep into the project, your spec is a total mess and the tiniest feature you want to add requires extremely complex manipulation of the spec.
The success people are experiencing is the success managers have experienced at the beginning of their software projects. SDD will fail for the same reason Waterfall has failed: the constantly increasing complexity of the project, required to keep code and spec consistent, cannot be managed by LLM or human.
The problem with what people call "Waterfall" is that there is an assumption that at some point you have a complete and correct spec and you code off of that.
A spec is never complete. Any methodology applied in a way that does not allow you to go back to revise and/or clarify specs will cause trouble. This was possible with waterfall and is more explicitly encouraged with various agile processes. How much it actually happens in practice differs regardless of how you name the methodology that you use.
In contrast they're still the standard in the hardware design world.
it didn't really kill it - it just made the spec massively disjoint, split across hundreds to thousands of randomly filled Jira tickets.
Amazon's Kiro is incredibly spec driven. Haven't tried it but interested. Amplifier has a strong document-driven-development loop also built-in. https://github.com/microsoft/amplifier?tab=readme-ov-file#-d...
SDD as it's presented is a bit heavyweight, but if you experiment with it a bit, there is a lighter version that can work.
For some mini modules, we keep a single page spec as 'source of truth' instead of the code.
It's nice and has its caveats, but they become less of a concern over time.
4ndrewl•1h ago
a) the multi-year lead time from starting the spec to getting a finished product
b) no (cheap) way to iterate or deliver outside the spec
Neither of these is a problem with SDD.
mytailorisrich•48m ago
"Heavy documentation before coding" (article) is essentially a bad practice that Agile identified and proposed a remedy to.
Now the article is really about AI-driven development, in which the AI agent is a "code monkey" that must be told precisely what to do. I think the interesting thing here will be to find the right balance... IMHO this works best when using LLMs only for small bits at a time instead of trying to specify the whole feature or product.
4ndrewl•21m ago
The key to Agile isn't documentation - it's in the ability to change at speed (perhaps as markets change). Literally "agile".
This approach allows for that comprehensive documentation without sacrificing agility.
mytailorisrich•5m ago
In addition, the big issue is when the comprehensive documentation is written first (as in waterfall) because it delays working software and feedback on how well the design works. Bluntly, this does not work.
That's why I think it is best to feed LLMs small chunks of work at a time, to keep the human dev in the driving seat to quickly iterate and experiment, and to be able to easily reason about the AI-generated code (who will do maintenance?)
The article seems to miss many of those points.
eric-burel•48m ago
yoz-y•22m ago
It's a bit funny to see people describe a spec written in days (hours) and iterations lasting multiple weeks as "waterfall".
But these days I've already had people argue that barely stopping to think about a problem before starting to prompt a solution is "too tedious of a process".
yoz-y•12m ago
They both have issues but they are very different. A waterfall project would have inscrutable structure and a large amount of "open doors" just in case a need of an extension at some place would materialize. Paradoxically this makes the code difficult to extend and debug because of overdone abstractions.
Hasty agile code has too many TODOs with "put this hardcoded value in a parameter". It is usually easier to add small features but when coming to a major design flaw it can be easier to throw everything out.
For UI code, AI seems to heavily tend towards the latter.
laserlight•7m ago
The detailed spec is exactly the problem with the waterfall development. The spec presumes that it is the solution, whereas Agile says “Heck, we don't even understand our problem well, let alone a solution to it.”
Beginning with a detailed spec fast with an LLM already puts you into a complex solution space, which is difficult to navigate compared to a simpler solution space. Regardless of the iteration speed, waterfall is the method that puts you into a complex space. Agile is the one you begin with smaller spaces to arrive at a solution.