Universal pre-training by iterated random computation

https://arxiv.org/abs/2506.20057
37•liamdgray•7mo ago

Comments

liamdgray•7mo ago
Abstract: "We investigate the use of randomly generated data for the sake of pre-training a model. We justify this approach theoretically from the perspective of algorithmic complexity, building on recent research that shows that sequence models can be trained to approximate Solomonoff induction. We derive similar, but complementary theoretical results. We show empirically that synthetically generated data can be used to pre-train a model before the data is seen. We replicate earlier results that models trained this way show zero-shot in-context learning across a variety of datasets, and that this performance improves with scale. We extend earlier results to real-world data, and show that finetuning a model after pre-training offers faster convergence and better generalization."
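As a rough illustration of what "data from iterated random computation" could look like, here is a minimal Python sketch. It is not the paper's generator. It swaps in small random finite-state transducers so it stays dependency-free, and every name and parameter (`random_transducer`, `iterated_random_data`, `num_states`, `alphabet_size`, `depth`) is a hypothetical choice made for this example.

```python
import random


def random_transducer(num_states, alphabet_size, rng):
    """Build a random finite-state transducer as a lookup table.

    table[state][symbol] -> (next_state, output_symbol)
    Hypothetical stand-in for a randomly initialized sequence model.
    """
    return [
        [(rng.randrange(num_states), rng.randrange(alphabet_size))
         for _ in range(alphabet_size)]
        for _ in range(num_states)
    ]


def run_transducer(table, sequence):
    """Feed a symbol sequence through the transducer, one output per input."""
    state = 0
    out = []
    for symbol in sequence:
        state, emitted = table[state][symbol]
        out.append(emitted)
    return out


def iterated_random_data(length, depth, num_states=8, alphabet_size=4, seed=0):
    """Start from uniform noise and pass it through `depth` random transducers.

    Each pass applies a freshly sampled random computation to the output of
    the previous one. With depth=0 the result is the initial noise itself.
    """
    rng = random.Random(seed)
    seq = [rng.randrange(alphabet_size) for _ in range(length)]
    for _ in range(depth):
        table = random_transducer(num_states, alphabet_size, rng)
        seq = run_transducer(table, seq)
    return seq


if __name__ == "__main__":
    print(iterated_random_data(length=40, depth=3, seed=1))
```

Sequences produced this way could serve as next-symbol prediction targets for a toy model. Whether such a generator yields useful pre-training data is exactly the empirical question the paper studies, and this sketch makes no claim about that.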
bionhoward•7mo ago
This is a cool concept, but I can’t help but wish there was more comparison between the treatment group and a control group that doesn’t see any universal pretraining data.

It’s good to compare various model sizes and evaluation tasks and random data generators. I just think the paper would more effectively prove its point if it could show that models of the same size which see this random data learn better from evaluation data later on.

Could even compare the initial checkpoint of the model before universal pretraining against the pretrained checkpoint. If the method works, the one that did UP will win.

Maybe I’m way off, I’ll admit I only skimmed it so far. Seems promising, just wishing for some controls.

yorwba•7mo ago
In figures 2, 4, and 6, the top left end of the training curves represents models that have not seen any pretraining data. In figure 5, they're represented by dashed curves.
visarga•7mo ago
Results are modest, maybe 20-30% fewer training steps to reach target performance. This won't solve the problem of organic data exhaustion. We need 100x more data.

They didn't test against actual language model pretraining, only tested against a random init.

- A: Pre-trained on their synthetic LSTM data -> fine-tuned on Wikipedia

- B: Pre-trained on different natural language corpus -> fine-tuned on Wikipedia

- C: Random initialization -> fine-tuned on Wikipedia

They only test A vs C, not A vs B.
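For anyone who wants to reproduce the "fewer training steps to reach target performance" style of comparison for arms like A and C above, here is a minimal Python sketch. The loss curves are made-up toy numbers, not results from the paper, and the function names are hypothetical.

```python
def steps_to_target(losses, target):
    """Return the 1-based step at which the loss first reaches `target`, or None."""
    for step, loss in enumerate(losses, start=1):
        if loss <= target:
            return step
    return None


def relative_step_savings(baseline_losses, pretrained_losses, target):
    """Fraction of training steps saved by the pretrained run.

    Returns None if either run never reaches the target loss.
    """
    base = steps_to_target(baseline_losses, target)
    pre = steps_to_target(pretrained_losses, target)
    if base is None or pre is None:
        return None
    return (base - pre) / base


if __name__ == "__main__":
    # Toy curves for illustration only (NOT data from the paper).
    random_init = [3.0, 2.6, 2.3, 2.0, 1.8, 1.7, 1.6, 1.5, 1.45, 1.40]
    pretrained = [2.5, 2.1, 1.8, 1.6, 1.5, 1.45, 1.40, 1.38, 1.37, 1.36]
    print(relative_step_savings(random_init, pretrained, target=1.40))
```

The same function would work for an A vs B comparison, given a loss curve for a model pre-trained on natural language.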

WithinReason•7mo ago
This paper addresses the problem of running out of data. You can't do B when you've run out of data, so it's irrelevant.
impossiblefork•7mo ago
20-30% isn't modest. I think there is a big problem, though: it's character-level prediction.

It's not obvious how to generate this kind of good synthetic data when it's to be fed to a tokenized model.
