
I ported JustHTML from Python to JavaScript with Codex CLI and GPT-5.2 in hours

https://simonwillison.net/2025/Dec/15/porting-justhtml/
106•pbowyer•6h ago

Comments

simonw•5h ago
I think the most interesting thing about this is how it demonstrates that a very particular kind of project is now massively more feasible: library porting projects that can be executed against implementation-independent tests.

The big unlock here is https://github.com/html5lib/html5lib-tests - a collection of 9,000+ HTML5 parser tests that are their own independent file format, e.g. this one: https://github.com/html5lib/html5lib-tests/blob/master/tree-...

The Servo html5ever Rust codebase uses them. Emil's JustHTML Python library used them too. Now my JavaScript version gets to tap into the same collection.

This meant that I could set a coding agent loose to crunch away on porting that Python code to JavaScript and have it keep going until that enormous existing test suite passed.

Sadly conformance test suites like html5lib-tests aren't that common... but they do exist elsewhere. I think it would be interesting to collect as many of those as possible.
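
For readers who have not seen the format: a tree-construction test in html5lib-tests is a plain-text record with `#data`, `#errors` and `#document` sections. The following is a minimal Python sketch of a reader, not the loader used by any of the libraries mentioned. It assumes records are separated by a blank line followed by `#data`; the `parse_dat` name and the two sample records are made up for illustration.

```python
# Section headers this sketch recognises; anything else is treated as content.
KNOWN_SECTIONS = {"errors", "new-errors", "document", "document-fragment",
                  "script-on", "script-off"}


def parse_dat(text):
    """Split html5lib-tests style text into a list of {section: content} dicts."""
    tests = []
    # Prefix a newline so the first record is split off like every other one.
    for chunk in ("\n" + text).split("\n#data\n")[1:]:
        record, section, lines = {}, "data", []
        for line in chunk.split("\n"):
            if line.startswith("#") and line[1:] in KNOWN_SECTIONS:
                record[section] = "\n".join(lines)
                section, lines = line[1:], []
            else:
                lines.append(line)
        record[section] = "\n".join(lines).rstrip("\n")
        tests.append(record)
    return tests


# Illustrative sample only; real files hold hundreds of records each.
SAMPLE = """#data
<p>One<p>Two
#errors
(1,3): expected-doctype-but-got-start-tag
#document
| <html>
|   <head>
|   <body>
|     <p>
|       "One"
|     <p>
|       "Two"

#data
<b>bold
#errors
#document
| <html>
|   <head>
|   <body>
|     <b>
|       "bold"
"""

tests = parse_dat(SAMPLE)
for t in tests:
    print(repr(t["data"]), "->", len(t["document"].splitlines()), "tree lines")
```

Because the records are just text, a port in any language only needs a reader like this plus a serializer that prints its own tree in the same `| ` layout.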

cies•4h ago
This is an interesting case. It may be good to feed it to other models and see how they do.

Also: it may be interesting to port it to other languages too and see how they do.

JS and Py are both runtime-typed and very well "spoken" by LLMs. Other languages may require a lot more "work" (data types, etc.) to get the port done.

gwking•4h ago
I’ve idly wondered about this sort of thing quite a bit. The next step would seem to be taking a project’s implementation dependent tests, converting them to an independent format and verifying them against the original project, then conducting the port.
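
A rough Python sketch of that workflow, under stated assumptions: the reference implementation is callable, its outputs are JSON-serializable, and the `record_fixtures`, `check_port` and `*_slugify` names are hypothetical stand-ins rather than anything from the projects discussed.

```python
import json


def record_fixtures(reference, inputs):
    """Run the reference implementation and capture input/output pairs as JSON."""
    cases = [{"input": x, "expected": reference(x)} for x in inputs]
    return json.dumps(cases, indent=2)


def check_port(candidate, fixtures_json):
    """Return the recorded cases that the candidate implementation gets wrong."""
    failures = []
    for case in json.loads(fixtures_json):
        actual = candidate(case["input"])
        if actual != case["expected"]:
            failures.append({**case, "actual": actual})
    return failures


# Hypothetical original implementation.
def reference_slugify(title):
    return "-".join(title.lower().split())


# Hypothetical port with a subtle difference on repeated spaces.
def ported_slugify(title):
    return title.lower().replace(" ", "-")


fixtures = record_fixtures(reference_slugify, ["Hello World", "a  b", "Already-slug"])

# Step 1: the fixtures must pass against the project they were recorded from.
assert check_port(reference_slugify, fixtures) == []

# Step 2: run the port against the same language-independent fixtures.
failures = check_port(ported_slugify, fixtures)
print(failures)
```

The JSON file is the portable artifact here: it no longer depends on the original project's test framework, so a port in another language can consume it directly.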
cr125rider•1h ago
I’ve got to imagine that a suite of end-to-end tests (probably the most common kind is fixture file in, assert against output fixture file) would have a very hard time covering all of the possible branches and paths. As in the example here, thousands of well-made tests are required.
skissane•1h ago
Give a coding agent some software. Ask it to write tests that maximise code coverage (source coverage if you have source code; if not, binary coverage). Consider using concolic fuzzing. Then give another agent the generated test suite, and ask it to write an implementation that passes. Automated software cloning. I wonder what results you might get?
heavyset_go•4h ago
This is one of the reasons I'm keeping tests to myself for a current project. Usually I release libraries as open source, but I've been rethinking that, as well.
simonw•4h ago
Oddly enough my conclusion is the opposite: I should invest more of my open source development work in creating language-independent test suites, because they can be used to quickly create all sorts of useful follow-on projects.
heavyset_go•2h ago
I'm not that generous with my time lol
cortesoft•2h ago
Isn't the point that you might be one of the people who benefits from one of those follow on projects? That is kind of the whole point of open source.

Why are you making your stuff open source in the first place if you don't want other people to build off of it?

bgwalter•1h ago
Open source has three main purposes, in decreasing order of importance:

1) Ensuring that there is no malicious code and enabling you to build it yourself.

2) Making modifications for yourself (Stallman's printer is the famous example).

3) Using other people's code in your own projects.

Item 3) is wildly over-propagandized as the sole reason for open source. Hard forks have traditionally led to massive flame wars.

We are now being told by corporations and their "AI" shills that we should diligently publish everything for free so the IP thieves can profit more easily. There is no reason to oblige them. Hiding test suites in order to make translations more difficult is a great first step.

visarga•14m ago
I think the only non-slop parts of the web are: open source, wikipedia, arXiv, some game worlds and social network comments in well behaved/moderated communities. What do they share in common? They all allow building on top, they are social first, people come together for interaction and collaboration.

The rest is enshittified web, focused on attention grabbing, retention dark patterns and misinformation. They all exist to make a profit off our backs.

A pattern I see is that we moved on from passive consumption and now want interactivity, sociality and reuse. We like to create together.

heavyset_go•1h ago
> Why are you making your stuff open source in the first place if you don't want other people to build off of it?

Because I enjoy the craft. I will enjoy it less if I know I'm being ripped off, likely for profit, hence my deliberate choices of licenses, what gets released and what gets siloed.

I'm happy if someone builds off of my work, as long as it's on my own terms.

aadishv•3h ago
I wonder if this makes AI models particularly well-suited to ML tasks, or at least ML implementation tasks, where you are given a target architecture and dataset and have to implement and train the given architecture on the given dataset. There are strong signals to the model, such as loss, which are essentially a slightly less restricted version of "tests".
simonw•3h ago
I'm certain this is the case. Iterating on ML models can actually be pretty tedious - lots of different parameters to try out, then you have to wait a bunch, then exercise the models, then change parameters and try again.

Coding agents are fantastic at these kinds of loops.

montroser•1h ago
We've been doing this at work a bunch with great success. The most impressive moment to me was when the model we were training did a type of overfitting, and rather than just claiming victory (as it all too often does), this time Claude went and just added a bunch more robust, human-grade examples to our training data and hold-out set, and kept iterating until the model effectively learned the actual crux of what we were trying to teach it.
cxr•5h ago
Few know that Firefox's HTML5 parser was originally written in Java, and only afterward semi-mechanically translated (pre-LLMs) to the dialect of C++ used in the Gecko codebase.

This blog post isn't really about HTML parsers, however. The JustHTML port described in this blog post was a worthwhile exercise as a demonstration on its own.

Even so, I suspect that for this particular application, it would have been more productive/valuable to port the Java codebase to TypeScript rather than using the already vibe coded JustHTML as a starting point. Most of the value of what is demonstrated by JustHTML's existence in either form comes from Stenström's initial work.

simonw•4h ago
There are certainly dozens of better ways to do what I did here.

I picked JustHTML as a base because I really liked the API Emil had designed, and I also thought it would be darkly amusing to take his painstakingly (1,000+ commits, 2 months+ of work) constructed library and see if I could port it directly to JavaScript in an evening, taking advantage of everything he had already figured out.

simonw•4h ago
Whoa... it looks like the Firefox HTML5 parser is still maintained as Java to this day!

Here's the relevant folder:

https://github.com/mozilla-firefox/firefox/tree/main/parser/...

  make translate        # perform the Java-to-C++ translation from the remote
                        # sources
And active commits to that javasrc folder - the last was in November: https://github.com/mozilla-firefox/firefox/commits/main/pars...
cxr•4h ago
I have secretly held the belief for a while that the Java implementation should be mechanically translated to TypeScript and then fixed up, annotated, and maintained not just primarily but entirely in that form; the requisite R&D/tooling should be created to:

(a) permit a fully mechanical, on-the-fly rederivation of the canonical TypeScript sources into Java, for Java consumers that need it (a lot like the ts->js step that happens for execution on JS engines), and

(b) compiler support that can go straight from the TypeScript subset used in the parser to a binary that's as performant as the current native implementation, without requiring any intermediate C++ form to be emitted or reviewed/vetted/maintained by hand

(Sidenote: Hejlsberg is being weird/not entirely forthcoming about the overall goals wrt the announcement last year about porting the TypeScript compiler to Go. We're due for an announcement that they've done something like lifted the Go compilers' backends out of the golang.org toolchain, strapped the legacy tsc frontend on top, allowing the TypeScript compiler to continue to be developed and maintained in TypeScript while executing with the performance previously seen mostly with tools written in Go vs those making do with running on V8.)

I agree with the overall conclusion of the post that what is demonstrated there is a good use case for LLMs. It might even be the best use for them, albeit something to be undertaken/maintained as part of the original project. It wouldn't be hugely surprising if that turned out to be the dominant use of LLM-powered coding assistants when everything shakes out (all the other promises that have been made for and about them notwithstanding).

No real reason that they couldn't play a significant role in the project I outlined above.

simonw•3h ago
I just blogged about this https://simonwillison.net/2025/Dec/17/firefox-parser/

... and then when I checked the henri-sivonen tag https://simonwillison.net/tags/henri-sivonen/ found out I'd previously written about the exact same thing 16 years earlier!

QuantumNomad_•3h ago
IANAL. In my opinion, porting code to a different language is still derivative work of the code you are porting it from, whether done by hand or with an LLM. And in my opinion, the license of the original code still applies. Which means that not only should one link to the repo for the code that was ported, but also make sure to adhere to the terms of the license.

The MIT family of licenses state that the copyright notice and terms shall be included in all copies of the software.

Porting code to a different language is in my opinion not much different from forking a project and making changes to it, small or big.

I therefore think the right thing to do is to keep the original copyright notice and license file, and add your additional copyright line to it.

So for example if the original project had an MIT license file that said

Copyright 2019 Suchandsuch

Permission is hereby granted and so on

You should keep all of that and add your copyright year and author name on the next line after the original line or lines of the authors of the repo you took the code from.

simonw•3h ago
I added Emil to my license file: https://github.com/simonw/justjshtml/blob/main/LICENSE

I'm not certain I should add the html5ever copyright holders, since I don't have a strong understanding of how much of their IP ended up in Emil's work - see https://news.ycombinator.com/item?id=46264195#46267059

f311a•5h ago
From original repository:

     Verified Compliance: Passes all 9k+ tests in the official html5lib-tests suite (used by browser vendors).
Yes, browsers do use it. But they handle a lot of stuff differently.

    selectolax  68%  No  Very Fast  CSS selectors C-based (Lexbor). Very fast but less compliant.
The original author compares selectolax to html5lib-tests, but the reality is that when you compare selectolax to Chrome output, you get 90%+.

One of the tests:

  INPUT: <svg><foreignObject></foreignObject><title></svg>foo
It fails for selectolax:

  Expected:
  | <html>
  |   <head>
  |   <body>
  |     <svg svg>
  |       <svg foreignObject>
  |       <svg title>
  |     "foo"
  Actual:
  | <html>
  |   <head>
  |   <body>
  |     <svg>
  |       <foreignObject>
  |       <title>
  |     "foo"

But you get this in Chrome and selectolax:

    <html><head></head><body><svg><foreignObject></foreignObject><title></title></svg>foo
    </body></html>
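
To make the difference between the two trees above concrete: the expected output follows the html5lib-tests convention of prefixing elements outside the HTML namespace (`<svg title>` rather than `<title>`). Below is a minimal Python sketch of such a serializer, assuming a toy `Node` type invented for this example; it says nothing about how selectolax or Chrome represent nodes internally.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy tree node (hypothetical): an element, or a text node when `text` is set."""
    name: str
    namespace: str = "html"
    text: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def dump(node, depth=0):
    """Serialize a tree in the '| ' indented layout used by the expected output."""
    indent = "| " + "  " * depth
    if node.text is not None:
        return [f'{indent}"{node.text}"']
    # Elements outside the HTML namespace carry their namespace as a prefix.
    prefix = "" if node.namespace == "html" else node.namespace + " "
    lines = [f"{indent}<{prefix}{node.name}>"]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


tree = Node("html", children=[
    Node("head"),
    Node("body", children=[
        Node("svg", namespace="svg", children=[
            Node("foreignObject", namespace="svg"),
            Node("title", namespace="svg"),
        ]),
        Node("#text", text="foo"),
    ]),
])

rendered = dump(tree)
print("\n".join(rendered))
```

A parser that builds the same tree shape but does not track namespaces would print the "Actual" form, which is one way a library can agree with browser output and still fail this test.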
minimaxir•4h ago
My opinion on the ending open questions:

> Does this library represent a legal violation of copyright of either the Rust library or the Python one? Even if this is legal, is it ethical to build a library in this way?

Currently, I am experimenting with two projects in Claude Code: a Rust/Python port of a Python repo which necessitates a full rewrite to get the desired performance/feature improvements, and a Rust/Python port of a JavaScript repo mostly because I refuse to install Node (the speed improvement is nice though).

In both of those cases, the source repos are permissively licensed (MIT), which I interpret as the developer intent as to how their code should be used. It is in the spirit of open source to produce better code by iterating on existing code, as that's how the software ecosystem grows. That would be the case whether a human wrote the porting code or not. If Claude 4.5 Opus can produce better/faster code which has the same functionality and passes all the tests, that's a win for the ecosystem.

As courtesy and transparency, I will still link and reference the original project in addition to disclosing the Agent use, although those things aren't likely required and others may not do the same. That said, I'm definitely not using an agent to port any GPL-licensed code.

simonw•4h ago
That's about where I'm settled on this right now. I feel like authors who select the GPL have made a robust statement about their intent. It may be legal for me to copyright-launder their library (maybe using the trick where one LLM turns their code into a spec and another turns that spec into fresh code) but I wouldn't do that because it would subvert the spirit of the license.
throwup238•4h ago
> As courtesy and transparency, I will still link and reference the original project in addition to disclosing the Agent use, although those things aren't likely required and others may not do the same. That said, I'm definitely not using an agent to port any GPL-licensed code.

IANAL but regardless of the license, you have to respect their copyright and it’s hard to argue that an LLM ported library is anything but a derivative work. You would still have to include the original copyright notices and retain the license (again IANAL).

minimaxir•4h ago
A similar argument could be made about generative AI and whether text/image outputs themselves are derivative works, which is a legal point of contention still being argued. It's unclear if code text from a generative AI is in scope.
throwup238•3h ago
That’s a legal point of contention because the nature of language/image models is hard to fit into the existing copyright framework. That only really applies to cleanroom-ish one shot requests where the inference input doesn’t contain the copyrighted material in question.

It’s a lot easier to argue that it’s a derivative work when you feed the copyrighted code directly into the context and ask it to port it to another language. If the copyrighted code is literally an input to the inference request, that would not escape any judge’s notice. The law may not have any precedent for this technology but judges aren’t automatons beholden to trivially buggy code that can’t adapt.

swyx•4h ago
> How much better would this library be if an expert team hand crafted it over the course of several months?

i think the fun conclusion would be: ideally no better, and no worse. that is the state you arrive at IFF you have complete tests and specs (including probably for performance). now a human team handcrafting would undoubtedly make important choices not clarified in specs, thereby extending the spec. i would argue that human chain of thought from deep involvement in building and using the thing is basically 100% of the value of human handcrafting, because otherwise yeah go nuts giving it to an agent.

tantalor•4h ago
> Can I even assert copyright over this, given how much of the work was produced by the LLM?

No, because it's a derivative work of the base library.

simonw•4h ago
That doesn't sound right to me. If it's a derivative work I can still assert copyright over the modifications I have made, but not over the original material.
tantalor•4h ago
You're right that derivative works are copyrightable. I got that wrong.

I think you can claim the prompt itself. But you didn't create the new code. I'd argue copyright belongs to the original author.

simonw•4h ago
Something I'm particularly interested in understanding is where the tipping point here is. At what point is a prompt or the input that accompanies a prompt enough for the result to be copyrightable?

This project is the absolute extreme: I handed over exactly 8 prompts, and several of those were just a few words. I count the files on disk as part of the prompts, but those were authored by other people.

The US copyright office say "the resulting work is copyrightable only if it contains sufficient human-authored expressive elements" - https://perkinscoie.com/insights/update/copyright-office-sol... - but what does that actually mean?

Emil's JustHTML project involved several months of work and 1,000+ commits - almost all of the code was written by agents but there was an enormous amount of what I'd consider "human-authored expressive elements" guiding that work.

Many of my smaller AI-assisted projects use prompts like this one:

> Fetch https://observablehq.com/@simonw/openai-clip-in-a-browser and analyze it, then build a tool called is-it-a-bird.html which accepts a photo (selected or drag dropped or pasted) and instantly loads and runs CLIP and reports back on similarity to the word “bird” - pick a threshold and show a green background if the photo is likely a bird

Result: https://tools.simonwillison.net/is-it-a-bird

It was a short prompt, but the Observable notebook it references was authored by me several years ago. The agent also looked at a bunch of other files in my tools repo as part of figuring out what to build.

I think that counts as a great deal of "human-authored expressive elements" by me.

So yeah, this whole thing is really complicated!

tantalor•3h ago
This is, of course, forgetting the fact that the model was trained on heaps and heaps of copyrighted work.

Laying claim to anything generated is very likely to fail.

simonw•3h ago
If it turns out you can't copyright code that was generated with the help of LLMs a whole bunch of $billion+ companies are going to have to throw away 18+ months of their work.
brailsafe•2h ago
> If it turns out you can't copyright code that was generated with the help of LLMs a whole bunch of $billion+ companies are going to have to throw away 18+ months of their work.

Hmm, it is interesting to think about that situation. Intuitively it would seem to me like there's some nuance between whether work would need to be "thrown out" or whether it just can't be sold as their own creation, marking some kind of divide between code produced and used privately for commercial purposes vs code that is produced and sold/provided publicly as a commercial product. The risk in doing the latter, or entirely throwing out the code, seems like it would be a relatively cheap risk that those companies do anyway all the time.

However, if I as a small business owner made a tool to help other businesses based on LLM code that used some of my own prior work for context, then sold the code itself as a product or sold a product with it as a dependency, it would be a much greater liability for me if it turned out to include copyrighted && unlicensed work that was produced by an LLM that further can't be claimed as my own.

Privately, on servers or in internal tooling not sold commercially, it would perhaps be next to impossible to either identify or enforce those limits. Without explicit attribution to an agent, I have no idea (with certainty anyway) which code anyone on my team has produced with an LLM, and it's not available publicly—aside from pure frontend web stuff—so I wonder in what capacity it would even be possible to throw specific chunks out if it was hypothetically enforceable.

leprechaun1066•4h ago
In this case the majority of the work was done by another company on your instruction. When you signed up was there anything in the terms that said you get ownership over the output?
simonw•4h ago
All of the notable generative AI companies have policies that they won't claim copyright over your outputs.

They also frequently offer "liability shields" where their legal teams will go to bat for you if you get sued for copyright infringement based on your usage of their tools.

https://help.openai.com/en/articles/5008634-will-openai-clai...

https://www.anthropic.com/news/expanded-legal-protections-ap...

https://ai.google.dev/gemini-api/terms#use-generated

StarterPro•4h ago
YOU didn't port shit, the ai did all the work.
simonw•4h ago
That's kind of the whole point of this exercise and my write-up of it.
kjgkjhfkjf•3h ago
I'm glad you wrote it up. Thanks! But I feel like the folks behind the HTML5 spec and the comprehensive test suite deserve the lion's share of the credit for this (very neat) achievement.

Most projects don't have a detailed spec at the outset. Decades of experience have shown that trying to build a detailed spec upfront does not work out well for a vast class of projects. And many projects don't even have a comprehensive test suite when they go into production!

simonw•3h ago
I completely agree. I hope I gave them enough credit in the blog post and the GitHub repo.
kjgkjhfkjf•3h ago
Yep, and I think it is a great way to draw attention to their work!
mirthturtle•4h ago
Wild to ask, "Is it legal, ethical, responsible or even harmful to build in this way and publish it?" AFTER building and publishing it. Author made up his mind already, or doesn't actually care. Ethics and responsibility should guide one's actions, not just be engagement fodder after the fact.
simonw•4h ago
If I thought this was clear-cut 100% unethical and irresponsible I wouldn't have done it. I think there's ample room for conversation about this. I'd like to help instigate that conversation.

I'm ready to take a risk to my own reputation in order to demonstrate that this kind of thing is possible. I think it's useful to help people understand that this kind of thing isn't just feasible now, it's somewhat terrifyingly easy.

ethanpil•3h ago

  >  It took two initial prompts and a few tiny follow-ups. GPT-5.2 running in Codex CLI ran uninterrupted for several hours, burned through 1,464,295 input tokens, 97,122,176 cached input tokens and 625,563 output tokens and ended up producing 9,000 lines of fully tested JavaScript across 43 commits.
Using a random LLM cost calculator, this amounts to $28.31... pretty reasonable for functional output.

I am now confident that within 5-10 years (most/all?) junior & mid and many senior dev positions are going to drop out enormously.

Source: https://www.llm-prices.com/#it=1464295&cit=97123000&ot=62556...
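
The arithmetic behind a figure like that can be sketched in a few lines of Python. The token counts are the ones quoted above; the per-million-token rates are assumptions picked because they land close to the quoted total, not prices confirmed anywhere in this thread, so check the provider's current price list before relying on them.

```python
def estimate_cost(input_tokens, cached_input_tokens, output_tokens,
                  *, input_rate, cached_rate, output_rate):
    """Return the cost in USD; all rates are USD per million tokens."""
    total = (input_tokens * input_rate
             + cached_input_tokens * cached_rate
             + output_tokens * output_rate)
    return total / 1_000_000


cost = estimate_cost(
    input_tokens=1_464_295,          # from the quoted write-up
    cached_input_tokens=97_122_176,  # from the quoted write-up
    output_tokens=625_563,           # from the quoted write-up
    input_rate=1.75,    # ASSUMED rate, USD per million input tokens
    cached_rate=0.175,  # ASSUMED rate, USD per million cached input tokens
    output_rate=14.00,  # ASSUMED rate, USD per million output tokens
)
print(f"${cost:.2f}")
```

Under these assumed rates the cached input tokens account for the largest share of the bill, even though they are billed at a small fraction of the uncached input rate.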

elcritch•3h ago
This is for porting an existing project. It’s an ideal case for LLMs. The results are still pretty different for building up a library from scratch.

However this changes the economics for languages with smaller ecosystems!

afro88•2h ago
People say this kind of thing a lot, but in reality the concept of "software engineer" will change and there will still be experience levels with different expectations
almostgotcaught•2h ago
> I am now confident that within 5-10 years (most/all?) junior & mid and many senior dev positions are going to drop out enormously.

yes because this is what we do all day every day (port existing libraries from one language to another)....

like do y'all hear yourselves or what?

hatefulheart•1h ago
I’m afraid the boosters hear nothing.

The commenter you’re replying to, in their heart of hearts, truly believes in 5 years that an LLM will be writing the majority of the code for a project like say Postgres or Linux.

Worth bearing in mind the boosters said this 5 years ago, and will say this in 5 years time.

cjlm•3h ago
Not all AI-assisted ports are quite so successful[0]

[0] https://ammil.industries/the-port-i-couldnt-ship/

zamadatix•2h ago
I think a big factor (of many probably) is there is a ~150x difference in bytes of source vs number of tests for them. I.e. I wonder what other projects are easy wins, which are hard ones, and which can be accomplished quickly with a certain approach.

It'd be really interesting if Simon gave a crack at the above and wrote about his findings in doing so. Or at least, I'd find it interesting :).

WhyOhWhyQ•3h ago
<p>© 2024 Example</p>

^Claude still thinks it's 2024. This happens to me consistently.

bgwalter•3h ago
I think the decision of SQLite to keep its large test suite private is very wise in the presence of thieves.
aster0id•3h ago
> Code is so cheap it’s practically free. Code that works continues to carry a cost, but that cost has plummeted now that coding agents can check their work as they go.

I personally think that even before LLMs, the cost of code wasn't necessarily the cost of typing out the characters in the right order, but having a human actually understand it to the extent that changes can be made. This continues to be true for the most part. You can vibe code your way into a lot of working code, but you'll inevitably hit a hairy bug or a real world context dependency that the LLM just cannot solve, and that is when you need a human to actually understand everything inside out and step in to fix the problem.

monkpit•2h ago
I wonder if we will trend towards a world where maintainability is just a waste of time and money, when you can just knock together a new flimsy thing quicker and cheaper than maintaining one thing over multiple iterations.
skydhash•1h ago
I don’t think that will ever be true. Let’s take a shell session as an example of ad-hoc code: People are still writing programs and scripts. Stuff doesn’t really change that often to warrant starting from scratch. Easier to add a new format to a music player than writing a new player from scratch.
vessenes•1h ago
Couple quick points from the read - cool, btw! It's not trivial that Simon poked the LLM to get something up and running and working ASAP - that's always been a good engineering behavior in my opinion - building on a working core - but I have found it's extra helpful/needed when it comes to LLM coding - this brings the compiler and tests "in the loop" for the LLM, and helps keep it on the rails - otherwise you may find you get 1,000s of lines of code that don't work or are just sort of a goose chase, or all gilding of lilies.

As is mentioned in the comments, I think the real story here is twofold - one, we're getting longer uninterrupted productive work out of frontier models - yay - and two, a formal test suite has just gotten vastly more useful in the last few months. I'd love to see more of these made.
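
The "tests in the loop" idea can be sketched as a small control loop in Python. This is an illustration under assumptions, not how Codex CLI or any other agent is implemented: `run_tests` and `attempt_fix` are hypothetical callables standing in for a real test runner and a real model call.

```python
def run_until_green(run_tests, attempt_fix, max_rounds=10):
    """Alternate test runs and fix attempts until nothing fails or the budget runs out.

    run_tests() returns a list of failing test names; attempt_fix(failing) tries
    to repair them. Returns (all_passing, failure_count_per_run).
    """
    history = []
    for _ in range(max_rounds):
        failing = run_tests()
        history.append(len(failing))
        if not failing:
            return True, history
        attempt_fix(failing)
    # Verify the last attempt instead of assuming it worked.
    failing = run_tests()
    history.append(len(failing))
    return not failing, history


# Toy stand-ins: three broken tests, and a "fix" that repairs one per round.
broken = {"t1", "t2", "t3"}


def toy_run_tests():
    return sorted(broken)


def toy_attempt_fix(failing):
    broken.discard(failing[0])


ok, history = run_until_green(toy_run_tests, toy_attempt_fix)
print(ok, history)
```

The failure count per run is the signal that keeps the loop on the rails: a count that stops shrinking is a cheap cue to stop and involve a human.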

mNovak•32m ago
While this example is explicitly asking for a port (thus a copy), I also find in general that LLMs' default behavior is to spit out new code from their vast pre-trained encyclopedia, vs adding an import to some library that already serves that purpose.

I'm curious if this will implicitly drive a shift in the usage of packages / libraries broadly, and if others think this is a good or bad thing. Maybe it cuts down the surface of upstream supply-chain attacks?

orange_puff•29m ago
This seems really impressive. I am too lazy to replicate this, but I do wonder how important the test suite is for a port that likely uses straightforward, dependency-free Python code https://github.com/EmilStenstrom/justhtml/tree/main/src/just...

It is enormously useful for the author to know that the code works, but my intuition is if you asked an agent to port files slowly, forming its own plan, making commits every feature, it would still get reasonably close, if not there.

Basically, I am guessing that this impressive output could have been achieved based on how good models are these days with large amounts of input tokens, without running the code against tests.

xarope•25m ago
"If you can reduce a problem to a robust test suite you can set a coding agent loop loose on it with a high degree of confidence that it will eventually succeed"

I'm a bit sad about this; I'd rather have "had fun" doing the coding, and get AI to create the test cases, than vice versa.

teppic•24m ago
Fuck
visarga•20m ago
I think specs + tests are the new source of truth; code is disposable and rebuildable. A well tested project is reliable both for humans and AI, a badly tested one is bad for both. When we don't test well I call it "vibe testing" or "LGTM testing".
febed•5m ago
What was your prompt to get it to run the test suite and heal tests at every step? I didn’t see that mentioned in your write up. Also, any specific reason you went with Codex over Claude Code?