
I replaced Animal Crossing's dialogue with a live LLM by hacking GameCube memory

https://joshfonseca.com/blogs/animal-crossing-llm
482•vuciv•8h ago•106 comments

Supabase OrioleDB Patent: now freely available to the Postgres community

https://supabase.com/blog/orioledb-patent-free
10•tosh•15m ago•1 comment

iPhone Air

https://www.apple.com/newsroom/2025/09/introducing-iphone-air-a-powerful-new-iphone-with-a-breakt...
761•excerionsforte•17h ago•1565 comments

PKM apps need to get better at resurfacing information

https://ankursethi.com/blog/pkm-apps-need-to-get-better-at-resurfacing-information/
7•GeneralMaximus•3d ago•1 comment

Knowledge and Memory

https://www.robinsloan.com/lab/knowledge-and-memory/
26•zdw•3d ago•10 comments

E-paper display reaches the realm of LCD screens

https://spectrum.ieee.org/e-paper-display-modos
457•rbanffy•17h ago•136 comments

NASA finds Titan's lakes may be creating vesicles with primitive cell walls

https://www.sciencedaily.com/releases/2025/08/250831112449.htm
166•Gaishan•11h ago•34 comments

Claude now has access to a server-side container environment

https://www.anthropic.com/news/create-files
566•meetpateltech•21h ago•302 comments

Children and young people's reading in 2025

https://literacytrust.org.uk/research-services/research-reports/children-and-young-peoples-readin...
29•GeoAtreides•4h ago•12 comments

US High school students' scores fall in reading and math

https://apnews.com/article/naep-reading-math-scores-12th-grade-c18d6e3fbc125f12948cc70cb85a520a
382•bikenaga•21h ago•624 comments

We all dodged a bullet

https://xeiaso.net/notes/2025/we-dodged-a-bullet/
722•WhyNotHugo•20h ago•406 comments

All clickwheel iPod games have now been preserved for posterity

https://arstechnica.com/gaming/2025/09/all-54-lost-clickwheel-ipod-games-have-now-been-preserved-...
128•CharlesW•1d ago•33 comments

Axial twist theory

https://en.wikipedia.org/wiki/Axial_twist_theory
154•lordnacho•3d ago•36 comments

Made for People, Not Cars: Reclaiming European Cities

https://www.greeneuropeanjournal.eu/made-for-people-not-cars-reclaiming-european-cities/
6•robtherobber•1h ago•0 comments

R-Zero: Self-Evolving Reasoning LLM from Zero Data

https://arxiv.org/abs/2508.05004
53•lawrenceyan•9h ago•19 comments

Crimson (YC X25) is hiring founding engineers in London

https://www.ycombinator.com/companies/crimson/jobs/kCikzj1-founding-engineer-full-stack
1•markfeldner•4h ago

A love letter to the CSV format (2024)

https://medialab.sciencespo.fr/en/news/a-love-letter-to-the-csv-format/
55•jordigh•2h ago•57 comments

Semantic Line Breaks

https://sembr.org
24•Bogdanp•2d ago•19 comments

My Workflow Is 70% AI, 20% Copy-Paste, 10% Panic. What's Yours?

16•jamessmithe•1h ago•34 comments

YouTube is a mysterious monopoly

https://anderegg.ca/2025/09/08/youtube-is-a-mysterious-monopoly
262•geerlingguy•1d ago•338 comments

Hypervisor in 1k Lines

https://1000hv.seiya.me/en
86•lioeters•12h ago•7 comments

Memory Integrity Enforcement

https://security.apple.com/blog/memory-integrity-enforcement/
415•circuit•17h ago•196 comments

Show HN: Bottlefire – Build single-executable microVMs from Docker images

https://bottlefire.dev/
125•losfair•2d ago•15 comments

Tomorrow's emoji today: Unicode 17.0

https://jenniferdaniel.substack.com/p/tomorrows-emoji-today-unicode-170
165•ChrisArchitect•17h ago•272 comments

Interesting PEZY-SC4s

https://chipsandcheese.com/p/pezy-sc4s-at-hot-chips-2025
12•christkv•3d ago•1 comment

Building a DOOM-like multiplayer shooter in pure SQL

https://cedardb.com/blog/doomql/
197•lvogel•20h ago•34 comments

A new experimental Go API for JSON

https://go.dev/blog/jsonv2-exp
231•darccio•20h ago•81 comments

Immunotherapy drug clinical trial results: half of tumors shrink or disappear

https://www.rockefeller.edu/news/38120-immunotherapy-drug-eliminates-aggressive-cancers-in-clinic...
405•marc__1•14h ago•79 comments

An attacker’s blunder gave us a look into their operations

https://www.huntress.com/blog/rare-look-inside-attacker-operation
160•mellosouls•20h ago•93 comments

Microsoft is officially sending employees back to the office

https://www.businessinsider.com/microsoft-send-employees-back-to-office-rto-remote-work-2025-9
357•alloyed•19h ago•722 comments

R-Zero: Self-Evolving Reasoning LLM from Zero Data

https://arxiv.org/abs/2508.05004
53•lawrenceyan•9h ago

Comments

cyberge99•7h ago
What could go wrong?
magicalhippo•1h ago
Just don't hook it into the nuclear missile controls. We've seen[1] how that goes[2].

[1]: https://en.wikipedia.org/wiki/Colossus:_The_Forbin_Project

[2]: https://en.wikipedia.org/wiki/The_Terminator

koakuma-chan•1h ago
[3] https://en.wikipedia.org/wiki/Re:Zero
jasonjmcghee•6h ago
Conceptually, it's effectively a GAN
magicalhippo•1h ago
For those not in the know, that's Generative Adversarial Networks[1], where two neural networks are trained in a competitive way.

One network typically generates tasks for the other, and is rewarded if it manages to make the other network fail the task. The other network is rewarded if it successfully completes the task.

Thus the adversarial network tries to find weaknesses to exploit, and the combined training makes the solving network much stronger. Or at least that's the idea.

[1]: https://en.wikipedia.org/wiki/Generative_adversarial_network
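That adversarial dynamic can be sketched in a few lines. This toy (all numbers and update rules invented for illustration, nothing from the paper) has a "challenger" that proposes task difficulties and a "solver" that tries to meet them; the challenger is rewarded when the solver fails, the solver when it succeeds:

```python
# Toy sketch of GAN-style adversarial training dynamics. The challenger
# ratchets up difficulty while the solver keeps winning; a failed task is
# the solver's training signal. Both values climb together.

def train(steps=100, lr=0.1):
    skill = 1.0       # solver's current ability
    difficulty = 0.5  # challenger's proposed task difficulty
    for _ in range(steps):
        solved = skill >= difficulty
        # Challenger: push difficulty up if the solver keeps succeeding,
        # back off slightly if its tasks are already too hard.
        difficulty += lr if solved else -lr / 2
        # Solver: failing a task is what raises skill.
        if not solved:
            skill += lr
    return skill, difficulty
```

The point of the sketch is that the challenger's difficulty ends up hovering just above the solver's skill, which is exactly the "find weaknesses to exploit" pressure described above.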

torginus•1h ago
GANs are a supervised training method, not really self-improving (after converging to being able to reproduce the training set).
frumiousirc•59m ago
My initial thought as well. But what is the "Discriminator" here? What grounds the training toward reality? The "Challenger" and "Solver" adversarial loop alone can only serve to amplify hallucination.

Ahh, GPT-4o is the arbiter.

So, basically, this is a way to perform LLM model compression (GPT-4o to qwen3) while maximizing the in-distribution domain size. As such, it seems reasonable and useful.

However the reliance on an arbiter LLM makes the claim that it will overcome the problem of a lack of training data unreasonable. Once the target LLM is scaled up to reach the in-distribution domain size of the arbiter, it seems to me it will turn back into a hallucination amplifier.

thom•3h ago
For values of zero quite far above zero.
falcor84•3h ago
What am I missing? From my skimming, there's zero external data beyond what is needed for the Challenger to generate questions.
thom•1h ago
An existing trained LLM is an enormous amount of 'data' however it might be encoded. AlphaZero didn't start with Stockfish or a database of games.
magicalhippo•1h ago
As I understand it the point of the article isn't to train a LLM from scratch, it's to teach a non-reasoning model to reason without additional explicit training data.
YeGoblynQueenne•1h ago
The abstract does use the term "from scratch":

>> To overcome this limitation, we introduce R-Zero, a fully autonomous framework that generates its own training data from scratch.

Giving the benefit of the doubt, they're just using it wrong, but the way they use it sure reads like they claim they found a way to initialise LLMs with 0 data. Only the absurdity of the claim protects the reader from such misunderstanding, and that's never a good thing in a research paper.

magicalhippo•45m ago
If you included the previous and following sentences, it's at least to me clear what they mean:

>> However, existing methods for training such models still rely heavily on vast human-curated tasks and labels, typically via fine-tuning or reinforcement learning, which poses a fundamental bottleneck to advancing AI systems toward capabilities beyond human intelligence.

>> To overcome this limitation, we introduce R-Zero, a fully autonomous framework that generates its own training data from scratch.

>> Starting from a single base LLM, R-Zero initializes two independent models with distinct roles, a Challenger and a Solver.

Training an LLM is a multi-stage process[1], and they're tackling the final stage. That's where you do fine-tuning or reinforcement learning. They're not training an LLM from scratch. They're explicitly stating they start from a base LLM, ie a pretrained non-tuned model.

As I understand it, and as they mention, training data for the latter stages has typically required high-quality human-curated samples in large numbers, even if they're augmented using LLMs, say by generating multiple variations of each human-curated training sample.

Their proposal is to have a generative adversarial network generate that data without any initial human input, ie from scratch.

[1]: https://snorkel.ai/blog/large-language-model-training-three-...
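A rough schematic of that Challenger/Solver data-generation loop. The model calls are stubbed out with random functions, and the filter thresholds are made up for illustration; this is my reading of the setup, not the paper's code:

```python
import random
from collections import Counter

# Schematic self-play data generation: a Challenger proposes questions, a
# Solver answers each one several times, and majority voting over the
# Solver's own answers supplies the pseudo-label. Questions the Solver
# finds too easy or too hard are filtered out.

def challenger(rng):
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"{a}+{b}", a + b  # question, ground truth (for the stub only)

def solver(question, truth, rng, accuracy=0.6):
    # Stub model: answers correctly with some probability, else guesses.
    return truth if rng.random() < accuracy else rng.randint(2, 18)

def build_training_batch(n_questions=50, n_samples=8, rng=None):
    rng = rng or random.Random(0)
    batch = []
    for _ in range(n_questions):
        q, truth = challenger(rng)
        answers = [solver(q, truth, rng) for _ in range(n_samples)]
        label, votes = Counter(answers).most_common(1)[0]
        # Keep only questions of intermediate difficulty: unanimous answers
        # teach nothing, and near-random answers give unreliable labels.
        if 0.25 <= votes / n_samples <= 0.9:
            batch.append((q, label))
    return batch
```

The interesting part is that no human labels appear anywhere: the pseudo-label comes from the Solver's self-consistency, which is also why (per the thread above) an external arbiter or verifiable domain matters, since nothing in this loop itself grounds the labels in reality.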

tucnak•1h ago
AlphaZero is oftentimes dragged out to ridicule the so-called "self-play LLM training" techniques, although I don't think these arguments are terribly convincing. You can think of AlphaZero games as effectively synthetic data in adversarial setting; yes, it's easy to produce and verify as the rules of chess are verifiable, so it doesn't require much data on paper. This is not the case for most texts, with some notable exceptions in verifiable domains, where self-play is coincidentally applied most successfully. Thus, you could make an argument that the pre-existing "trained LLM" is merely functioning as a verifier proxy, analogous to the well-defined chess verifier in AlphaZero.
nakamoto_damacy•1h ago
Perpetual Motion Machines were a thing at some point, too.
YeGoblynQueenne•1h ago
Don't laugh. PMMs work! I built mine ten years ago when I realised I could improve the SOTA by a huge 20%. I've been improving it for the last 10 years and I get an average performance boost of ~0.25 every year. We will have Free Energy in the next 10 years.
api•1h ago
I refer to the endless self improving runaway AI as an “information theoretic perpetual motion machine.”

This will work in a sense. It will do… something… and learn… something. It will be unrelated to the physical universe in any way. See also: procedural landscape generators, etc.

K0balt•4m ago
Might kinda work if you gave it tools to do its research on the open internet, Fiverr, Mechanical Turk, etc.
clbrmbr•26m ago
Terrible choice of name. DeepSeek developed a historically important model called "R1-Zero" (the predecessor to R1 that was trained without any cold-start SFT; it was very strong, but its chain of thought was difficult to read because it code-switched into Chinese and had no line breaks).