frontpage.

2025: The Year in LLMs

https://simonwillison.net/2025/Dec/31/the-year-in-llms/
139•simonw•2h ago•75 comments

I canceled my book deal

https://austinhenley.com/blog/canceledbookdeal.html
346•azhenley•8h ago•229 comments

Show HN: BusterMQ, Thread-per-core NATS server in Zig with io_uring

https://bustermq.sh/
32•jbaptiste•2h ago•3 comments

Scientists unlock brain's natural clean-up system for new treatments for stroke

https://www.monash.edu/pharm/about/news/news-listing/latest/scientists-unlock-brains-natural-clea...
77•PaulHoule•4h ago•14 comments

Warren Buffett steps down as Berkshire Hathaway CEO after six decades

https://www.latimes.com/business/story/2025-12-31/warren-buffett-steps-down-as-berkshire-hathaway...
400•ValentineC•5h ago•251 comments

Resistance training load does not determine hypertrophy

https://physoc.onlinelibrary.wiley.com/doi/10.1113/JP289684
57•Luc•4h ago•57 comments

All-optical synthesis chip for large-scale intelligent semantic vision

https://www.science.org/doi/10.1126/science.adv7434
56•QueensGambit•6h ago•9 comments

Demystifying DVDs

https://hiddenpalace.org/News/One_Bad_Ass_Hedgehog_-_Shadow_the_Hedgehog#Demystifying_DVDs
112•boltzmann-brain•2d ago•8 comments

Observed Agent Sandbox Bypasses

https://voratiq.com/blog/yolo-in-the-sandbox/
28•m-hodges•3d ago•16 comments

GoGoGrandparent (YC S16) Is Hiring Tech Leads

https://www.ycombinator.com/companies/gogograndparent/jobs/w2jGKM7-gogograndparent-yc-s16-is-hiri...
1•davidchl•1h ago

On privacy and control

https://toidiu.com/blog/2025-12-25-privacy-and-control/
146•todsacerdoti•8h ago•79 comments

Ÿnsect, a French insect farming startup, has been placed into liquidation

https://techcrunch.com/2025/12/26/how-reality-crushed-ynsect-the-french-startup-that-had-raised-o...
69•fcpguru•5d ago•75 comments

My role as a founder-CTO: year 8

https://miguelcarranza.es/cto-year-8
99•ridruejo•5d ago•90 comments

Nerd: A language for LLMs, not humans

https://www.nerd-lang.org/about
34•gnanagurusrgs•1h ago•64 comments

The compiler is your best friend

https://blog.daniel-beskin.com/2025-12-22-the-compiler-is-your-best-friend-stop-lying-to-it
140•based2•11h ago•90 comments

PyPI in 2025: A Year in Review

https://blog.pypi.org/posts/2025-12-31-pypi-2025-in-review/
47•miketheman•7h ago•12 comments

Web Browsers have stopped blocking pop-ups

https://www.smokingonabike.com/2025/12/31/web-browsers-have-stopped-blocking-pop-ups/
67•coldpie•9h ago•58 comments

The Delete Act

https://privacy.ca.gov/drop/about-drop-and-the-delete-act/
105•weaksauce•2h ago•55 comments

Akin's Laws of Spacecraft Design (2011) [pdf]

https://www.ece.uvic.ca/~elec399/201409/Akin%27s%20Laws%20of%20Spacecraft%20Design.pdf
268•tosh•16h ago•82 comments

Scaffolding to Superhuman: How Curriculum Learning Solved 2048 and Tetris

https://kywch.github.io/blog/2025/12/curriculum-learning-2048-tetris/
120•a1k0n•10h ago•28 comments

Show HN: Use Claude Code to Query 600 GB Indexes over Hacker News, ArXiv, etc.

https://exopriors.com/scry
308•Xyra•19h ago•112 comments

When square pixels aren't square

https://alexwlchan.net/2025/square-pixels/
112•PaulHoule•13h ago•55 comments

The most famous transcendental numbers

https://sprott.physics.wisc.edu/pickover/trans.html
141•vismit2000•14h ago•84 comments

Microtonal Spiral Piano

https://shih1.github.io/spiral/
76•phoenix_ashes•5d ago•13 comments

Show HN: Frockly – A visual editor for understanding complex Excel formulas

40•jack_ruru•6d ago•9 comments

How AI labs are solving the power problem

https://newsletter.semianalysis.com/p/how-ai-labs-are-solving-the-power
126•Symmetry•13h ago•203 comments

Stewart Cheifet, creator of The Computer Chronicles, has died

https://obits.goldsteinsfuneral.com/stewart-cheifet
204•spankibalt•9h ago•62 comments

Nvidia GB10's Memory Subsystem, from the CPU Side

https://chipsandcheese.com/p/inside-nvidia-gb10s-memory-subsystem
71•ingve•14h ago•7 comments

The rise of industrial software

https://chrisloy.dev/post/2025/12/30/the-rise-of-industrial-software
218•chrisloy•17h ago•158 comments

Iron Beam: Israel's first operational anti-drone laser system

https://mod.gov.il/en/press-releases/press-room/israel-mod-and-rafael-deliver-first-operational-h...
85•fork-bomber•12h ago•115 comments

2025: The Year in LLMs

https://simonwillison.net/2025/Dec/31/the-year-in-llms/
136•simonw•2h ago

Comments

AndyNemmity•2h ago
These are excellent every year, thank you for all the wonderful work you do.
tkgally•47m ago
Same here. Simon is one of the main reasons I’ve been able to (sort of) keep up with developments in AI.

I look forward to learning from his blog posts and HN comments in the year ahead, too.

waldrews•1h ago
Remember, back in the day, when a year of progress was like, oh, they voted to add some syntactic sugar to Java...
throwup238•1h ago
> they voted to add some syntactic sugar to Java...

I remember when we just wanted to rewrite everything in Rust.

Those were the simpler times, when crypto bros seemed like the worst venture capitalism could conjure.

OGEnthusiast•1h ago
Crypto bros in hindsight were so much less dangerous than AI bros. At least they weren't trying to construct data centers in rural America or prop up artificial stocks like $NVDA.
SauntSolaire•8m ago
Instead they were building crypto mining warehouses in rural America and propping up artificial currencies like BTC.
sanreau•1h ago
> Vendor-independent options include GitHub Copilot CLI, Amp, OpenHands CLI, and Pi

...and the best of them all, OpenCode[1] :)

[1]: https://opencode.ai

simonw•1h ago
Good call, I'll add that. I think I mentally scrambled it with OpenHands.
the_mitsuhiko•1h ago
Thanks for adding pi to it though :)
nineteen999•1h ago
How did I miss this until now! Thank you for sharing.
logicprog•31m ago
I don't know why you're being downvoted; OpenCode is by far the best.
the_mitsuhiko•1h ago
> The (only?) year of MCP

I'd like to believe that, but MCP is quickly turning into an enterprise thing, so I think it will stick around for good.

simonw•1h ago
I think it will stick around, but I don't think it will have another year where it's the hot thing it was back in January through May.
Alex-Programs•37m ago
I never quite got what was so "hot" about it. There seems to be an entire parallel ecosystem of corporates that are just begging to turn AI into PowerPoint slides so that they can mould it into a shape that's familiar.
npalli•1h ago
Great summary of the year in LLMs. Is there a predictions (for 2026) blogpost as well?
simonw•1h ago
Given how badly my 2025 predictions aged I'm probably going to sit that one out! https://simonwillison.net/2025/Jan/10/ai-predictions/
skydhash•1h ago
Pretty much a whole year of nothing, really. Just coming up with a bunch of abstractions and ideas trying to solve an unsolvable problem: getting reliable results from an unreliable process while assuming the process is reliable.

At least when herding cats, you can be sure that if the cats are hungry, they will try to get where the food is.

MattRix•1h ago
I’m not sure how to tell you how obvious it is you haven’t actually used these tools.
skydhash•1h ago
Why do people assume negative critique is ignorance?
dmd•1h ago
People denied that bicycles could possibly balance even as others happily pedaled by. This is the same thing.
measurablefunc•1h ago
Bicycles don't balance, the human on the bicycle is the one doing the balancing.
dmd•1h ago
Yes, that is the analogy I am making. People argued that bicycles (a tool for humans to use) could not possibly work - even as people were successfully using them.
measurablefunc•51m ago
People use drugs as well but I'm not sure I'd call that successful use of chemical compounds without further context. There are many analogies one can apply here that would be equally valid.
skydhash•59m ago
Please tell me which one of the headings is not about increased usage of LLMs and derived tools, and is instead about some improvement along the axes of reliability or any other kind of usefulness.

Here is the changelog for OpenBSD 7.8:

https://www.openbsd.org/78.html

There's nothing in there that says: we made it easier to use more of it. It's about using it better and fixing underlying problems.

simonw•56m ago
The coding agent heading. Claude Code and tools like it represent a huge improvement in what you can usefully get done with LLMs.

Mistakes and hallucinations matter a whole lot less if a reasoning LLM can try the code, see that it doesn't work and fix the problem.
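
To make "try, observe, fix" concrete, here's a minimal sketch of that loop (the names agent_loop and generate are mine, and generate stands in for whatever model call you use, not any particular API):

    import subprocess

    def agent_loop(task, generate, max_attempts=3):
        # "generate" is a placeholder for the model call: it takes the task
        # plus the previous error output and returns Python source code.
        feedback = ""
        for _ in range(max_attempts):
            code = generate(task, feedback)
            with open("attempt.py", "w") as f:
                f.write(code)
            result = subprocess.run(
                ["python", "attempt.py"],
                capture_output=True, text=True, timeout=60,
            )
            if result.returncode == 0:
                return code               # the script ran cleanly; stop here
            feedback = result.stderr      # feed the error back for another try
        return None                       # give up after max_attempts

The real tools wrap a lot more around this (file edits, shell access, permission prompts), but that error-feedback loop is what makes individual mistakes matter less.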

walt_grata•43m ago
If it actually does that without an argument. I can't believe I have to say that about a computer program.
skydhash•40m ago
> The coding agent heading. Claude Code and tools like it represent a huge improvement in what you can usefully get done with LLMs.

Does it? It's all prompt manipulation. Shell scripts are powerful, yes, but not really a huge improvement over having a shell (a REPL interface) to the system. And even then a lot of programs just use syscalls or wrapper libraries.

> can try the code, see that it doesn't work and fix the problem.

Can you really say that happens reliably?

simonw•37m ago
Depends on what you mean by "reliably".

If you mean 100% correct all of the time then no.

If you mean correct often enough that you can expect it to be a productive assistant that helps solve all sorts of problems faster than you could solve them without it, and which makes mistakes infrequently enough that you waste less time fixing them than you would doing everything by yourself then yes, it's plenty reliable enough now.

dham•12m ago
You're welcome to try the LLMs yourself and come to your own conclusions. From what you've posted it doesn't look like you've tried anything in the last 2 years. Yes, LLMs can be annoying, but there has been progress.
noodletheworld•38m ago
I know it seems like forever ago, but Claude Code only came out in 2025.

It's very difficult to argue against the point that Claude Code:

1) was a paradigm shift in terms of functionality, despite (to be fair) at best incremental improvements in the underlying models.

2) produced results that are, I estimate, an order of magnitude better in terms of output.

I think it's very fair to distill “AI progress in 2025” to: you can get better results (up to a point; better than raw model output, anyway; scaling to multiple agents has not worked) without better models, using clever tools and loops. (…and video/image slop infests everything :p).

bandrami•29m ago
Did more software ship in 2025 than in 2024? I'm still looking for some actual indication of output here. I get that people feel more productive but the actual metrics don't seem to agree.
skydhash•20m ago
I'm still waiting for the Linux drivers to be written because of all the 20x improvements that AI hypers are touting. I would even settle for Apple M3 and M4 computers to be supported by Asahi.
noodletheworld•8m ago
I am not making any argument about productivity about using AI vs. not using AI.

My point is purely that, compared to 2024, the quality of the code produced by LLM inference agent systems is better.

To say that 2025 was a nothing burger is objectively incorrect.

Will it scale? Is it good enough to use professionally? Is this like self-driving cars, where even the best they ever get still gets stuck on an odd-shaped traffic cone? Is it actually more productive?

Who knows?

I'm just saying… LLM coding in 2024 sucked. 2025 was a big year.

tehnub•55m ago
People did?
rhubarbtree•31m ago
It’s possible this is correct.

It’s also possible that people more experienced, knowledgeable and skilled than you can see fundamental flaws in using LLMs for software engineering that you cannot. I am not including myself in that category.

I’m personally, honestly, undecided. I’ve been coding for over 30 years and know something like 25 languages. I’ve taught programming to postgrad level and built prototype AI systems that foreshadowed LLMs. I’ve written everything from embedded systems to enterprise, web, mainframes, real-time, physics simulation and research software. I would consider myself a 7/10 or 8/10 coder.

A lot of folks I know are better coders. To put my experience into context: one guy in my year at uni wrote one of the world’s most famous crypto systems; another wrote large portions of some of the most successful games of the last few decades. So I’ve grown up surrounded by geniuses, basically, and whilst I’ve been lectured by true greats I’m humble enough to recognise I don’t bleed code like they do. I’m just a dabbler. But it irks me that a lot of folks using AI profess it’s the future but don’t really know anything about coding compared to these folks. Not to be a Luddite - they are the first people to adopt new languages and techniques, but they also are super sceptical about anything that smells remotely like bullshit.

One of the wisest insights in coding is the aphorism “beware the enthusiasm of the recently converted.” And I see that so much with AI. I’ve seen it with compilers, with IDEs, paradigms, and languages.

I’ve been experimenting a lot with AI, and I’ve found it fantastic for comprehending poor code written by others. I’ve also found it great for bouncing ideas around. But the code it writes, beyond boilerplate, is hot garbage. It doesn’t properly reason, it can’t design architecture, and it can’t write code that is comprehensible to other programmers. Treating the codebase as a “black box to be manipulated by AI” just leads to dead ends that can’t be escaped, terrible decisions that take huge amounts of expert coding time to undo, subtle bugs that the AI can’t fix and that are super hard to spot (often in code you can’t understand well enough to fix yourself), and security nightmares.

Testing is insufficient for good code. Humans write code in a way that is designed for general correctness. AI does not, at least not yet.

I do think these problems can be solved. I think we probably need automated reasoning systems, or else vastly improved LLMs that border on automated reasoning much like humans do. Could be a year. Could be a decade. But right now these tools don’t work well. Great for vibe coding, prototyping, analysis, review, bouncing ideas.

blibble•17m ago
people also said that selling jpegs of monkeys for millions of dollars was a pump and dump scam, and would collapse

they were right

kakapo5672•39m ago
Whenever someone tells me that AI is worthless, does nothing, scam/slop etc, I ask them about their own AI usage, and their general knowledge about what's going on.

Invariably they've never used AI, or at most very rarely. (If they had used AI beyond that, it would be an admission that it was useful at some level.)

Therefore it's reasonable to assume that you are in that boat. Now that might not be true in your case, who knows, but it's definitely true on average.

LewisVerstappen•25m ago
because your "negative critique" is just idiotic and wrong
senordevnyc•38m ago
This comment is legitimately hilarious to me. I thought it was satire at first. The list of what has happened in this field in the last twelve months is staggering to me, while you write it off as essentially nothing.

Different strokes, but I’m getting so much more done and mostly enjoying it. Can’t wait to see what 2026 holds!

ronsor•30m ago
People who dislike LLMs are generally insistent that they're useless for everything and have infinitely negative value, regardless of facts they're presented with.

Anyone that believes that they are completely useless is just as deluded as anyone that believes they're going to bring an AGI utopia next week.

n2d4•36m ago
This is extremely dismissive. Claude Code helps me make a majority of changes to our codebase now, particularly small ones, and is an insane efficiency boost. You may not have the same experience for one reason or another, but plenty of devs do, so "nothing happened" is absolutely wrong.

2024 was a lot of talk, a lot of "AI could hypothetically do this and that". 2025 was the year where it genuinely started to enter people's workflows. Not everything we've been told would happen has happened (I still make my own presentations and write my own emails) but coding agents certainly have!

bandrami•31m ago
Did you ship more in 2025 than in 2024?
wickedsight•25m ago
I definitely did.
GCUMstlyHarmls•15m ago
Shipping in 2025: https://x.com/trq212/status/2001848726395269619
skydhash•12m ago
And this is one of those vague "AI helped me do more" claims.

This is me touting Emacs:

Emacs was a great plus for me over the last year. The integration with various tooling, via comint (REPL integration), compile (build and reporting tools), and terminals (through eat or ansi-term), gave me a unified experience through Emacs's buffer paradigm. Using the same set of commands sped up my editing, and the ease of adding new commands made it simple to fit the editor to my development workflow.

This is how easy it is to write a non-vague "tool X helped me", and I'm not even a native English speaker.

castwide•1h ago
2025: The Year in LLMs

I will never stop treating hallucinations as inventions. I dare you to stop me. i double dog dare y

aussieguy1234•1h ago
> The year of YOLO and the Normalization of Deviance

On this, including AI agents deleting home folders: I was able to run agents in Firejail by isolating VS Code (most of my agents are VS Code based, like Kilo Code).

I wrote a little guide on how I did it https://softwareengineeringstandard.com/2025/12/15/ai-agents...

It took a bit of tweaking (VS Code crashed a bunch of times because it couldn't read its config files), but I got there in the end. Now it can only write to my projects folder. All of my projects are backed up in git.
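
For a rough idea of the shape of it (paths and flags here are illustrative assumptions, not the exact commands from the guide), the core is a Firejail whitelist so VS Code and anything it spawns can only write inside the projects folder, something like this small launcher:

    import subprocess
    from pathlib import Path

    home = Path.home()

    # Firejail's --whitelist mounts a clean view of $HOME and exposes only
    # the listed paths, so the editor (and any agent inside it) can only
    # write there. VS Code's own config/extension dirs are included so it
    # can still read its settings instead of crashing on startup.
    subprocess.run([
        "firejail",
        f"--whitelist={home}/projects",
        f"--whitelist={home}/.config/Code",
        f"--whitelist={home}/.vscode",
        "code", "--new-window",
    ], check=True)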

agentifysh•1h ago
What amazing progress in such a short time. The future is bright! Happy New Year, y'all!
sho_hn•1h ago
Not in this review: it was also a record year for intelligent systems aiding in, and prompting, fatal self-harm by human users.

Will 2026 fare better?

simonw•1h ago
I really hope so.

The big labs are (mostly) investing a lot of resources into reducing the chance their models will trigger self-harm and AI psychosis and suchlike. See the GPT-4o retirement (and resulting backlash) for an example of that.

But the number of users is exploding too. If they make harmful outcomes 5x less likely but sign up 10x more people, that's still twice as many incidents in absolute terms, so it won't be good on that front.

measurablefunc•1h ago
The people working on this stuff have convinced themselves they're on a religious quest so it's not going to get better: https://x.com/RobertFreundLaw/status/2006111090539687956
andai•1h ago
Also essential self-fulfilment.

But that one doesn't make headlines ;)

sho_hn•1h ago
Sure -- but that's fair game in engineering. I work on cars. If we kill people with safety faults I expect it to make more headlines than all the fun roadtrips.

What I find interesting with chat bots is that they're "web apps" so to speak, but with safety engineering aspects that type of developer is typically not exposed to or familiar with.

simonw•1h ago
One of the tough problems here is privacy. AI labs really don't want to be in the habit of actively monitoring people's conversations with their bots, but they also need to prevent bad situations from arising and getting worse.
walt_grata•39m ago
Until AI labs have the equivalent of an SLA for giving accurate and helpful responses, it won't get better. They're not even able to measure whether the agents work correctly and consistently.
websiteapi•1h ago
I'm curious how all of the progress will be seen if it does indeed result in mass unemployment (but not eradication) of professional software engineers.
simonw•1h ago
I nearly added a section about that. I wanted to contrast the thing where many companies are reducing junior engineering hires with the thing where Cloudflare and Shopify are hiring 1,000+ interns. I ran out of time and hadn't figured out a good way to frame it though so I dropped it.
ori_b•56m ago
My prediction: If we can successfully get rid of most software engineers, we can get rid of most knowledge work. Given the state of robotics, manual labor is likely to outlive intellectual labor.
beardedwizard•35m ago
"Given the state of robotics" reminds me a lot of what was said about llms and image/video models over the past 3 years. Considering how much llms improved, how long can robotics be in this state?

I have to think 3 years from now we will be having the same conversation about robots doing real physical labor.

"This is the worst they will ever be" feels more apt.

DrewADesign•43m ago
You’re absolutely right! You astutely observed that 2025 was a year with many LLMs and this was a selection of waypoints, summarized in a helpful timeline.

That’s what most non-tech people’s year in LLMs looked like.

Hopefully 2026 will be the year where companies realize that implementing intrusive chatbots can’t make better ::waving hands:: ya know… UX or whatever.

For some reason, they think it’s helpful to distractingly pop up chat windows on their site because their customers need textual kindergarten handholding to … I don’t know… find the ideal pocket comb for their unique pocket/hair situation, or have an unlikely question about that aerosol pan-release spray that a chatbot could actually answer. Well, my dog also thinks she’s helping me by attacking the vacuum when I’m trying to clean. Both ideas are equally valid.

And spending a bazillion dollars implementing it doesn’t mean your customers won’t hate it. And forcing your customers into pathways they hate because of your sunk costs mindset means it will never stop costing you more money than it makes.

I just hope companies start being honest with themselves about whether or not these things are good, bad, or absolutely abysmal for the customer experience and cut their losses when it makes sense.

Night_Thastus•35m ago
They need to be intrusive and shoved in your face. This way, they can say they have a lot of people using them, which is a good and useful metric.
ronsor•31m ago
> For some reason, they think it’s helpful to distractingly pop up chat windows on their site...

Companies have been doing this "live support" nonsense far longer than LLMs have been popular.

techpression•29m ago
Nothing about the severe impact on the environment, and the hand-waviness about water usage hurt to read. The referenced post missed every single point about the issue by making it global instead of local. And it's not as if data center buildouts are properly planned and dimensioned for existing infrastructure…

Add to this that all the hardware is already old and the amount of waste we’re producing right now is mind-boggling. And for what? Fun tools for the use of one?

I don’t live in the US, but the amount of tax money being siphoned to a few tech bros should have heads rolling and I really don’t want to see it happening in Europe.

But I guess we got a new version number on a few models and some blown-up benchmarks, so that's good. Oh, and of course the SVG images we will never use for anything.

simonw•21m ago
"Nothing about the severe impact on the environment"

I literally said:

"AI data centers continue to burn vast amounts of energy and the arms race to build them continues to accelerate in a way that feels unsustainable."

AND I linked to my coverage from last year, which is still true today (hence why I felt no need to update it): https://simonwillison.net/2024/Dec/31/llms-in-2024/#the-envi...

smileson2•24m ago
forgot to mention the first murder-suicide instigated by chatgpt
didip•12m ago
Indeed. I don't understand why Hacker News is so dismissive about the coming of LLMs; maybe HN readers are going through the five stages of grief?

But LLMs are certainly a game changer; I can see them delivering an impact bigger than the internet itself. Both require a lot of investment.

cebert•8m ago
Many people feel threatened by the rapid advancements in LLMs, fearing that their skills may become obsolete, and in turn act irrationally. To navigate this change effectively, we must keep open minds, stay adaptable, and embrace continuous learning. It took a long time for folks to accept that climate change was real, too.
syndacks•4m ago
I can’t get over the range of sentiment on LLMs. HN leans snake oil, X leans “we’re all cooked”; can it possibly be both? How do other folks make sense of this? I’m not asking anyone to pick a side, just trying to understand the range. Does the range lead you to believe X over Y? Are all new technologies so polarizing?