Brookhaven Lab's RHIC concludes 25-year run with final collisions

https://www.hpcwire.com/off-the-wire/brookhaven-labs-rhic-concludes-25-year-run-with-final-collis...
20•gnufx•2h ago•3 comments

SectorC: A C Compiler in 512 bytes

https://xorvoid.com/sectorc.html
60•valyala•3h ago•12 comments

I write games in C (yes, C)

https://jonathanwhiting.com/writing/blog/games_in_c/
103•valyala•3h ago•76 comments

Speed up responses with fast mode

https://code.claude.com/docs/en/fast-mode
33•surprisetalk•3h ago•43 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
137•AlexeyBrin•8h ago•25 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
83•vinhnx•6h ago•10 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
845•klaussilveira•23h ago•252 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
1079•xnx•1d ago•615 comments

Al Lowe on model trains, funny deaths and working with Disney

https://spillhistorie.no/2026/02/06/interview-with-sierra-veteran-al-lowe/
58•thelok•5h ago•8 comments

The F Word

http://muratbuffalo.blogspot.com/2026/02/friction.html
13•zdw•3d ago•0 comments

Reinforcement Learning from Human Feedback

https://rlhfbook.com/
88•onurkanbkrc•8h ago•5 comments

Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
509•theblazehen•3d ago•188 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
226•jesperordrup•13h ago•80 comments

Microsoft account bugs locked me out of Notepad – Are thin clients ruining PCs?

https://www.windowscentral.com/microsoft/windows-11/windows-locked-me-out-of-notepad-is-the-thin-...
33•josephcsible•1h ago•26 comments

StrongDM's AI team build serious software without even looking at the code

https://simonwillison.net/2026/Feb/7/software-factory/
38•simonw•5h ago•62 comments

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

https://github.com/Momciloo/fun-with-clip-path
21•momciloo•3h ago•2 comments

We mourn our craft

https://nolanlawson.com/2026/02/07/we-mourn-our-craft/
296•ColinWright•2h ago•349 comments

Coding agents have replaced every framework I used

https://blog.alaindichiappari.dev/p/software-engineering-is-back
245•alainrk•8h ago•391 comments

72M Points of Interest

https://tech.marksblogg.com/overture-places-pois.html
34•marklit•5d ago•6 comments

Selection Rather Than Prediction

https://voratiq.com/blog/selection-rather-than-prediction/
11•languid-photic•3d ago•4 comments

France's homegrown open source online office suite

https://github.com/suitenumerique
599•nar001•7h ago•263 comments

A Fresh Look at IBM 3270 Information Display System

https://www.rs-online.com/designspark/a-fresh-look-at-ibm-3270-information-display-system
42•rbanffy•4d ago•8 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
170•1vuio0pswjnm7•9h ago•228 comments

History and Timeline of the Proco Rat Pedal (2021)

https://web.archive.org/web/20211030011207/https://thejhsshow.com/articles/history-and-timeline-o...
20•brudgers•5d ago•4 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
119•videotopia•4d ago•36 comments

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

https://github.com/sandys/kappal
27•sandGorgon•2d ago•14 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
89•speckx•4d ago•99 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
206•limoce•4d ago•112 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
282•isitcontent•23h ago•38 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
293•dmpetrov•23h ago•156 comments

Universal pre-training by iterated random computation

https://arxiv.org/abs/2506.20057
37•liamdgray•7mo ago

Comments

liamdgray•7mo ago
Abstract: "We investigate the use of randomly generated data for the sake of pre-training a model. We justify this approach theoretically from the perspective of algorithmic complexity, building on recent research that shows that sequence models can be trained to approximate Solomonoff induction. We derive similar, but complementary theoretical results. We show empirically that synthetically generated data can be used to pre-train a model before the data is seen. We replicate earlier results that models trained this way show zero-shot in-context learning across a variety of datasets, and that this performance improves with scale. We extend earlier results to real-world data, and show that finetuning a model after pre-training offers faster convergence and better generalization."
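To make the recipe in the abstract concrete, here is a minimal sketch (my own illustration, not code from the paper): sample sequences from freshly initialized, untrained LSTM generators, pre-train a small next-token model on that purely synthetic stream, and only then fine-tune on real text. The byte-level vocabulary, the model sizes, and the training-loop details are assumptions made purely for illustration.

```python
# Sketch of "universal" pre-training on randomly generated data.
# Not the paper's code; generator choice, sizes, and vocab are assumptions.
import torch
import torch.nn as nn

VOCAB, SEQ_LEN, BATCH = 128, 64, 32  # byte-level vocabulary (assumption)

def random_lstm_batches(n_batches):
    """Sample token sequences from freshly initialized, untrained LSTM
    generators -- the 'random computation' that produces pre-training data."""
    for _ in range(n_batches):
        with torch.no_grad():
            emb = nn.Embedding(VOCAB, 32)             # a new random generator
            lstm = nn.LSTM(32, 64, batch_first=True)  # for every batch
            head = nn.Linear(64, VOCAB)
            seq = torch.randint(0, VOCAB, (BATCH, 1))
            state = None
            for _ in range(SEQ_LEN - 1):
                out, state = lstm(emb(seq[:, -1:]), state)
                probs = torch.softmax(head(out[:, -1]), dim=-1)
                seq = torch.cat([seq, torch.multinomial(probs, 1)], dim=1)
        yield seq

class TinyLM(nn.Module):
    """The model being pre-trained: a small next-token predictor."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, 64)
        self.rnn = nn.LSTM(64, 128, batch_first=True)
        self.head = nn.Linear(128, VOCAB)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.head(h)

def train_on(model, batches):
    """Next-token cross-entropy training loop, shared by both phases."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for seq in batches:
        logits = model(seq[:, :-1])
        loss = loss_fn(logits.reshape(-1, VOCAB), seq[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

model = TinyLM()
# Phase 1: pre-train on synthetic data generated before any real data is seen.
train_on(model, random_lstm_batches(n_batches=200))
# Phase 2: fine-tune on real byte-level text batches (e.g. Wikipedia),
# shaped the same way as the synthetic batches above (hypothetical source).
# train_on(model, real_text_batches)
```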
bionhoward•7mo ago
This is a cool concept, but I can't help wishing for more comparison between the treatment group and a control group that doesn't see any universal pretraining data.

It's good to compare various model sizes, evaluation tasks, and random data generators. I just think the paper would prove its point more effectively if it could show that models of the same size which see this random data go on to learn better from evaluation data later.

They could even compare the initial checkpoint of the model, from before universal pretraining, against the pretrained checkpoint. If the method works, the one that did UP will win.

Maybe I'm way off; I'll admit I've only skimmed it so far. It seems promising; I'm just wishing for some controls.

yorwba•7mo ago
In figures 2, 4, and 6, the top left end of the training curves represents models that have not seen any pretraining data. In figure 5, they're represented by dashed curves.
visarga•7mo ago
Results are modest, maybe 20-30% fewer training steps to reach target performance. This won't solve the problem of organic data exhaustion. We need 100x more data.

They didn't test against actual language-model pretraining, only against a random init.

- A: Pre-trained on their synthetic LSTM data -> fine-tuned on Wikipedia

- B: Pre-trained on different natural language corpus -> fine-tuned on Wikipedia

- C: Random initialization -> fine-tuned on Wikipedia

They only test A vs C, not A vs B.

WithinReason•7mo ago
This paper addresses the problem of running out of data. You can't do B when you've run out of data, so it's irrelevant.
impossiblefork•7mo ago
20-30% isn't modest. I do think there's a big problem, though: it's character-level prediction.

It's not obvious how to generate this kind of good synthetic data when it's to be fed to a tokenized model.
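As an illustration of that concern, the sketch below (my own, not from the paper; it assumes the tiktoken library and its GPT-2 BPE encoding purely for demonstration) compares how a random character stream and natural text break into subword tokens:

```python
# Illustration of the tokenization concern: random character streams
# yield very different token statistics than natural text under BPE.
# Assumes the tiktoken library and its "gpt2" encoding, for illustration only.
import random
import string
import tiktoken

enc = tiktoken.get_encoding("gpt2")

def chars_per_token(text: str) -> float:
    """Average number of characters covered by each BPE token."""
    return len(text) / max(len(enc.encode(text)), 1)

random.seed(0)
random_stream = "".join(random.choices(string.ascii_lowercase + " ", k=2000))
natural_text = "the quick brown fox jumps over the lazy dog " * 45

print("random stream :", round(chars_per_token(random_stream), 2))
print("natural text  :", round(chars_per_token(natural_text), 2))
# The random stream sits well below natural English (roughly 4 chars/token
# under GPT-2 BPE), so feeding such data to a subword-tokenized model means
# training on a token distribution it will rarely see at fine-tuning time.
```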