From specification to stress test: a weekend with Claude

https://www.juxt.pro/blog/from-specification-to-stress-test/
26•henrygarner•1h ago

Comments

henrygarner•1h ago
Over a weekend, between board games and time with my kids, Claude and I built a distributed system with Byzantine fault tolerance, strong consistency and crash recovery under arbitrary failures. I described the behaviour I wanted in Allium, worked through the bugs conversationally and didn't write a line of implementation code.
SPICLK2•27m ago
I don't see why you need to bring your kids into this, and as a parent any suggestion of being distracted by tech during time with the kids raises my suspicions.

We have a strict no laptops and no phones rule when the kids are around (unless we're specifically doing something with them using those tools - looking at the weather forecast, or looking up some information).

"I can prompt AI while playing with the kids" is not a future I want.

altmanaltman•28m ago
This blog post is a sophisticated piece of content marketing for a company called JUXT and their proprietary tool, "Allium." While the technical achievement is plausible, the framing is heavily distorted to sell a product.

Here is the breakdown of the flaws and the "BS" in the narrative.

1. The "I Didn't Write Code" Lie

The author claims, "I didn't write a line of implementation code."

The Flaw: He wrote 3,000 lines of "Allium behavioural specification."

The BS: Writing 3,000 lines of a formal specification language is coding. It’s just coding in a proprietary, high-level language instead of Kotlin.

The Ratio is Terrible: The post admits the output was ~5,500 lines of Kotlin. That means for every 1 line of spec, he got roughly 1.8 lines of code.

Why this matters: True "low-code" or "no-code" leverage is usually 1:10 or 1:100. If you have to write 3,000 lines of strict logic to get a 5,500-line program, you haven't saved much effort; you've just swapped languages.
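The ratio arithmetic above can be checked with a short sketch. The line counts are the approximate figures quoted in this comment, and the 1:10 and 1:100 targets are the commenter's own rule of thumb, not a measured benchmark.

```python
# Approximate figures quoted in the comment above.
spec_lines = 3_000      # lines of Allium specification
kotlin_lines = 5_500    # lines of generated Kotlin

# Lines of generated code per line of specification.
leverage = kotlin_lines / spec_lines
print(f"{leverage:.1f} lines of Kotlin per line of spec")

# Spec size that the cited 1:10 and 1:100 ratios would imply
# for the same amount of generated code.
for ratio in (10, 100):
    implied_spec = kotlin_lines / ratio
    print(f"1:{ratio} leverage would need about {implied_spec:.0f} spec lines")
```

On these numbers the spec would have to shrink from 3,000 lines to roughly 550 or 55 to reach the leverage the comment treats as typical.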

2. The "Weekend Project" Myth

The post frames this as a casual project done "between board games and time with my kids."

The Flaw: This timeline ignores the massive "pre-computation" done by the human.

The BS: To write 3,000 lines of coherent, bug-free specifications for a Byzantine Fault Tolerant (BFT) system, you need to have the entire architecture fully resolved in your head before you start typing. The author is an expert (CTO level) who likely spent weeks or years thinking about these problems. The "48 hours" only counts the typing time, not the engineering time.

3. The "Byzantine Fault Tolerance" (BFT) Bait-and-Switch

The headline claims "Byzantine fault tolerance," which implies a system that continues to operate correctly even if nodes lie or act maliciously (extremely hard to build).

The Flaw: A "Resolved Question" block in the text admits: "The system's goal is Byzantine fault detection, not classical BFT consensus."

The BS: Real BFT (like PBFT or Raft with signatures) is mathematically rigorous and keeps the system running. "Fault Detection" just means "if the two copies don't match, stop." That is significantly easier to build. Calling it "BFT" in the intro is a massive overstatement of the system's resilience.

4. The "Maintenance Nightmare" (The Vendor Lock-in Trap)

The post glosses over how this system is maintained.

The Flaw: You now have 5,500 lines of Kotlin that no human wrote.

The BS: This is the "Model Driven Architecture" (MDA) trap from the early 2000s.

Scenario: You find a bug in the Kotlin code.

Option A: You fix the Kotlin. Result: Your code is now out of sync with the Spec. You can never regenerate from Spec again without losing your fix.

Option B: You fix the Spec. Result: You hope the AI generates the exact Kotlin fix you need without breaking 10 other things.

The Reality: You are now 100% dependent on the "Allium" tool and Claude. If you stop paying for Allium, you have a pile of unmaintainable machine-generated code.

5. The Performance "Turning Point"

The dramatic story about 10,000 Requests Per Second (RPS) has a hole in it.

The Flaw: The "bottleneck" wasn't the code; it was a Docker proxy setting (gvproxy).

The BS: This is a standard "gotcha" for anyone using Docker on Mac. Framing this as a triumph of AI debugging is a stretch: any senior engineer would check network topology when seeing high latency but low CPU usage. 10k RPS is also not "ambitious" for a modern distributed system; a single well-optimized Node.js or Go process can handle that easily.

antonly•5m ago
Hello, could you please put your over-sensationalized, overly-long, AI-generated comments somewhere else? Thank you.

Kindly, the HN Community.

bandrami•1m ago
> Your code is now out of sync with the Spec

Is there even a sync to be had? The same prompt to the same LLM at different times will yield different artifacts, even if you were to save and re-use the seed.

amarble•10m ago
As a counterpoint, I also tried writing something with Claude last weekend: a Google Docs clone[1]. I spent $170 on Anthropic API credits, and got something that did mostly what I asked but was basically useless. It seems that for simple interfaces for which there is an exact specification, like the recent compiler and web browser examples, it's possible to write bigger projects that "work" as a demo, though not in a way that makes them viable alternatives. For anything that requires taste and judgment, we've still got a long way to go. There are lots of great demos out there but few if any real examples of vibe coded (or whatever you want to call it) software standing alone as an alternative to projects people wrote.

[1] https://www.marble.onl/posts/this_cost_170.html

Startup Investment Tracker for Europe

https://www.newsider.com/
1•wrahim•21s ago•0 comments

Working Yourself Out of a Job

https://dak.dev/blog/working-yourself-out-of-a-job
1•vinlock•13m ago•0 comments

Show HN: A CODEOWNERS management cli in Rust

https://github.com/code-input/cli
2•codeinput•15m ago•0 comments

Show HN: Geo Racers – Race from London to Tokyo on a single bus pass

https://geo-racers.com/
2•pattle•19m ago•0 comments

ANSI Escape Code Injection in OpenAI's Codex CLI

https://dganev.com/posts/2026-02-12-ansi-escape-injection-codex-cli/
2•syl5x•20m ago•1 comments

Turn Security Threats: A Hacker's View

https://www.enablesecurity.com/blog/turn-server-security-threats/
1•obscure6•24m ago•0 comments

Pseudonyms Used by Donald Trump

https://en.wikipedia.org/wiki/Pseudonyms_used_by_Donald_Trump
1•KoftaBob•27m ago•0 comments

Show HN: Vibe Deploy... Deploy full-stack apps to your own servers via AI

https://runos.com/blog/vibe-deploy.html
1•didierbreedt•31m ago•2 comments

Only use agents for tasks you know how to do

https://zknill.io/posts/only-ai-tasks-you-know-how-to-do/
1•zknill•31m ago•0 comments

Scripting on the JVM with Java, Scala, and Kotlin

https://mill-build.org/blog/19-scripting-on-the-jvm.html
1•lihaoyi•33m ago•0 comments

DeepSeek with 1M context window is loaded for testing

https://chat.deepseek.com/share/a17q3v4uja85p6sqon
1•HowardMei•35m ago•1 comments

Show HN: SuperLocalMemory– Local-first AI memory for Claude, Cursor and 16+tools

https://github.com/varun369/SuperLocalMemoryV2
1•varunpratap369•36m ago•0 comments

AI researchers are sounding the alarm on their way out the door

https://www.cnn.com/2026/02/11/business/openai-anthropic-departures-nightcap
1•tigerlily•37m ago•0 comments

Show HN: Threadlink – Turn long AI chats into portable context cards

https://threadlink.xyz/
1•skragus•40m ago•1 comments

Show HN: I built an open-source About A macOS style photo manager for Windows

https://github.com/OliverZhaohaibin/iPhotron-LocalPhotoAlbumManager
1•main-protect•42m ago•0 comments

Subscription-Based API Throttling Without Client API Keys

https://metaduck.com/subscription-based-api-throttling-without-client-api-keys/
1•pgte•51m ago•0 comments

Show HN: I Made a Math Crossword

https://do-say-go.github.io/hexfiend/?hn=4
2•keepamovin•51m ago•0 comments

Tomorrow, and tomorrow – Ian McKellen analyzes Macbeth speech (1979) [video]

https://www.youtube.com/watch?v=zGbZCgHQ9m8
1•fyredge•54m ago•0 comments

Show HN: Graphthulhu – Knowledge Graph MCP Server for Logseq and Obsidian

https://github.com/skridlevsky/graphthulhu
1•skridlevsky•58m ago•0 comments

Show HN: I made a game where you factor RSA

https://do-say-go.github.io/insights/
1•keepamovin•59m ago•1 comments

What to do before thinking hard

https://timktitarev.wordpress.com/2026/02/12/what-to-do-before-thinking-hard/
1•tim-kt•1h ago•1 comments

ComponentPro Mail library – This website used to sell stolen software

https://www.componentpro.com/products/mail/
1•ZeljkoS•1h ago•0 comments

Why am I unable to submit posts containing web addresses?

1•main-protect•1h ago•1 comments

Ask HN: How do you prevent sensitive data leaks in screen-recorded demos?

2•gongjunhao•1h ago•3 comments

Majutsu An Emacs interface for Jujutsu / jj, like Magit

https://github.com/0WD0/majutsu
2•fanf2•1h ago•0 comments

Two Autonomous Claudes, Full System Access, No Instructions. An Experiment

https://codingsoul.org/2026/02/12/two-autonomous-claudes-full-system-access-no-instructions-an-ex...
1•hleichsenring•1h ago•1 comments

Show HN: Kreuzberg Comparative Benchmarks

https://kreuzberg.dev/benchmarks
1•nhirschfeld•1h ago•0 comments

Data Influence

https://daeus.blog/2025/12/28/data-influence/
1•haakonhr•1h ago•0 comments

xAI's Moonshot Meeting:Billion-Image Floods, and a Lunar AI Factory (No, Really)

https://kirkstechtips.com/xais-moonshot-meeting-layoffs-billion-image-floods-and-a-lunar-ai-facto...
2•dudexsnave•1h ago•1 comments

Recreated the Nier Automata UI in React

https://yorha-design.vercel.app/
3•p0u4a•1h ago•0 comments