Get Shit Done: A Meta-Prompting, Context Engineering and Spec-Driven Dev System

https://github.com/gsd-build/get-shit-done

71•stefankuehnel•1h ago

Comments

prakashrj•1h ago

With GSD, I was able to write 250K lines of code in less than a month, without prior knowledge of Claude.

rsoto2•1h ago

I could copy 250k lines from GitHub.

Faster than using AI. Cheaper. Code is better tested/more secure. I can learn/build with other humans.

prakashrj•56m ago

This is how I test my code currently.

  1. Backend unit tests: fast in-memory tests that run the full suite in ~5 seconds on every save.
  2. Full end-to-end tests: automated UI tests that spin up a real cloud server, run through the entire user journey (provision → connect → manage → teardown), and verify the app behaves correctly on all supported platforms (phone, tablet, desktop).
  3. Screenshot regression tests: every E2E run captures named screenshots and diffs them against saved baselines. Any unintended UI change gets caught automatically.
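
For readers who want to picture the screenshot regression step, here is a minimal sketch of the idea. It is not the setup described above: the function names and the `baseline_dir`/`run_dir` layout are hypothetical, and it compares files byte for byte, which is stricter than the perceptual diffing that real screenshot tools usually apply.

```python
import hashlib
from pathlib import Path


def digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_screenshots(baseline_dir: Path, run_dir: Path) -> list[str]:
    """List screenshots from this run that are new or differ from the baseline.

    Assumption: both directories hold PNG files with matching names.
    """
    changed = []
    for shot in sorted(run_dir.glob("*.png")):
        baseline = baseline_dir / shot.name
        # A missing baseline or a different digest both count as a change.
        if not baseline.exists() or digest(baseline) != digest(shot):
            changed.append(shot.name)
    return changed
```

A test runner could fail the E2E run whenever `changed_screenshots(...)` returns a non-empty list, then require a human to approve the new images before they become the baseline.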

prakashrj•54m ago

I was not an app developer before, but a systems engineer with devops experience. But I learnt a lot about Apple development and App Store Connect, and essentially became an app developer in a month. I don't think I could learn so quickly with other humans' help.

0x696C6961•13m ago

If you lost access to AI would you be able to continue development on your app?

wslh•1h ago

250K? Could you expand on your experience with details about your project and the lessons and issues you found?

prakashrj•1h ago

A self-hosted VPN server manager: a TypeScript/Hono backend that runs on your own VPS, paired with a SwiftUI iOS/macOS app. It lets you provision cloud servers across multiple providers (Hetzner, DigitalOcean, Vultr), manage them via a Tailscale-secured connection with TLS pinning, and control an OpenClaw gateway.

I will open source it in a few weeks, as I still have to complete a few more features.

prakashrj•52m ago

It's important to build a local dev environment that GSD can iterate on. Once I have done that, I just discuss with GSD and a few hours later features land.

MeetingsBrowser•1h ago

I've tried it, and I'm not convinced I got measurably better results than just prompting Claude Code directly.

It absolutely tore through tokens though. I don't normally hit my session limits, but hit the 5-hour limits in ~30 minutes and my weekly limits by Tuesday with GSD.

testycool•57m ago

Same experience on multiple occasions.

greenchair•1h ago

terrible name, DOA

obsidianbases1•1h ago

> If you know clearly what you want

This is the real challenge. The people I know who jump around to new tools have a tough time explaining what they want, and thus how the new tool is better than the last one.

boringg•1h ago

What do you think drives the tooling ecosystem aside from VC dollars?

jauntywundrkind•36m ago

These are incredible new superpowers. The LLMs let us do far, far more than we could before. But they create an information glut and don't come with built-in guards to prevent devolution from setting in. It feels unsurprising but also notable that a third of what folks are suddenly building is harness/prompting/coordination systems, because everyone is trying to adapt & figure out process shapes for using these new superpowers well.

There's some VC money interest but I'd classify more than 9 / 10ths of it as good old fashioned wildcat open source interest. Because it's fascinating and amazing, because it helps us direct our attention & steer our works.

And also it's so much more approachable and interesting, now that it's all tmux terminal stuff. It's so much more direct & hackable than, say, wading into vscode extension building, deep in someone else's brambly thicket of APIs, and where the skeleton is already in place anyhow, where you are only grafting little panes onto the experience rather than recasting the experience. The devs suddenly don't need or care for or want that monolithic big UI, and have new soaring freedom to explore something much nearer to them, much more direct, and much more malleable: the terminal.

There are so many different forms of this happening all at once. Totally different topic, but still in the same broad area, submitted just now too: Horizon, an infinite canvas for terminals/AI work. https://github.com/peters/horizon https://news.ycombinator.com/item?id=47416227

dfltr•1h ago

GSD has a reputation for being a token burner compared to something like Superpowers. Has that changed lately? Always open to revisiting things as they improve.

maccam912•1h ago

I've had a good experience with https://github.com/obra/superpowers. At first glance this looks similar. Has anyone tried both who can offer a comparison?

observationist•43m ago

It's one of those things where having a structure is really helpful - I've used some similar prompt scaffolds, and the difference is very noticeable.

Another great technique is to use one of these structures in a repo, then task your AI with overhauling the framework using best practices for whatever your target project is. It works great for creative writing, humanizing, songwriting, technical/scientific domains, and so on. In conjunction with agents, these are excellent to have.

I think they're going to be a temporary thing - a hack that boosts utility for a few model releases until there are enough successful use cases in the training data that models can just do this sort of thing really well without all the extra prompting.

These are fun to use.

yolonir•40m ago

I've used both. From my experience, GSD is a highly overengineered piece of software that unfortunately does not get shit done, burns limits and takes ages while doing so. Quick mode does not really help because it kills the point of GSD: you can't build full software on ad-hocs. I've used plain markdown planning before, but it was limiting and not very stable. Superpowers looks like a good middle ground.

gbrindisi•55m ago

I like openspec, it lets you tune the workflow to your liking and doesn’t get in the way.

I started with all the standard spec flow and as I got more confident and opinionated I simplified it to my liking.

I think the point of any spec-driven framework is that you want to eventually own the workflow yourself, so that you can constrain code generation on your own terms.

yoaviram•44m ago

I've been using GSD extensively over the past 3 months. I previously used speckit, which I found lacking. GSD consistently gets me 95% of the way there on complex tasks. That's amazing. The last 5% is mostly "manual" testing. We've used GSD to build and launch a SaaS product including an agent-first CMS (whiteboar.it).

It's hard to say why GSD worked so much better for us than other similar frameworks, because the underlying models also improved considerably during the same period. What is clear is that it's a huge productivity boost over vanilla Claude Code.

gtirloni•25m ago

I was using this and superpowers but eventually, Plan mode became enough and I prefer to steer Claude Code myself. These frameworks are great for fire-and-forget tasks, especially when there is some research involved but they burn 10x more tokens, in my experience. I was always hitting the Max plan limits for no discernable benefit in the outcomes I was getting. But this will vary a lot depending on how people prefer to work.

jghn•10m ago

I've gone the other way recently, shifting from pure plan mode to superpowers. I was reminded of it due to the announcement of the latest version.

It is perhaps confirmation bias on my part but I've been finding it's doing a better job with similar problems than I was getting with base plan mode. I've been attributing this to its multiple layers of cross checks and self-reviews. Yes, I could do that by hand of course, but I find superpowers is automating what I was already trying to accomplish in this regard.

Andrei_dev•23m ago

250K lines in a month — okay, but what does review actually look like at that volume?

I've been poking at security issues in AI-generated repos and it's the same thing: more generation means less review. Not just logic — checking what's in your .env, whether API routes have auth middleware, whether debug endpoints made it to prod.

You can move that fast. But "review" means something different now. Humans make human mistakes. AI writes clean-looking code that ships with hardcoded credentials because some template had them and nobody caught it.

All these frameworks are racing to generate faster. Nobody's solving the verification side at that speed.
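
To make one of those checks concrete, here is a rough sketch of a hardcoded-credential scan. It is only an illustration: `SECRET_PATTERN` and `find_hardcoded_secrets` are hypothetical names, the regex is a crude heuristic that only catches quoted string literals assigned to suspicious names, and it is no substitute for a dedicated secret scanner or for checking routes and debug endpoints.

```python
import re

# Heuristic: a suspicious identifier assigned a quoted literal of 8+ characters.
SECRET_PATTERN = re.compile(
    r"""(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*["']([^"']{8,})["']"""
)


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line number, identifier) pairs for likely hardcoded credentials."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        if match:
            findings.append((lineno, match.group(1)))
    return findings
```

Running this over every changed file in CI gives a cheap first filter. Values read from the environment do not match, since the pattern requires a quoted literal on the right-hand side.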

kace91•11m ago

Code is a cost. It seems everyone's forgotten.

Saying "I generated 250k lines" is like saying "I used 2500 gallons of gas". Cool, nice expense, but where did you get? Because if it's three miles, you're just burning money.

250k lines is roughly SQLite or Redis in project size. Do you have SQLite-maintaining money? Did you get as far as Redis did in outcomes?

lielcohen•11m ago

This is the key insight. The generation vs. verification speed gap is a fundamental architectural problem with single-agent workflows. When one agent writes 250K lines, the verification bottleneck isn't just about running tests - it's about catching the things tests don't cover: hardcoded credentials, missing auth middleware, debug endpoints in prod. One approach that works well is splitting generation and verification into separate agents with different system prompts and ideally different models. The 'verifier' agent only sees the spec and the output code, never the generation context. It catches a surprising amount of the 'looks clean but is broken' issues because it doesn't share the same blind spots as the generator.

mbb70•8m ago

https://news.ycombinator.com/newsguidelines.html#generated

dhorthy•15m ago

it is very hard for me to take seriously any system that is not proven for shipping production code in complex codebases that have been around for a while.

I've been down the "don't read the code" path and I can say it leads nowhere good.

I am perhaps talking my own book here, but I'd like to see more tools that brag about "shipped N real features to production" or "solved Y problem in large-10-year-old-codebase"

I'm not saying that coding agents can't do these things and such tools don't exist, I'm just afraid that counting 100k+ LOC that the author didn't read kind of fuels the "this is all hype-slop" argument rather than helping people discover the ways that coding agents can solve real and valuable problems.

arjie•6m ago

I could not produce useful output from this. It was useful as a rubber duck because it asks good motivating questions during the plan phase, but the actual implementation was lacklustre and not worth the effort. In the end, I just have Claude Opus create plans, and then I have it write them to memory and update it as it goes along and the output is better.

thr0waway001•5m ago

At the risk of sounding stupid what does the author mean by: “I’m not a 50-person software company. I don’t want to play enterprise theatre.” ?
