frontpage.

Expanding Racks [video]

https://www.youtube.com/watch?v=iWknov3Xpts
53•doctoboggan•2h ago•4 comments

Chatterbox TTS

https://github.com/resemble-ai/chatterbox
371•pinter69•10h ago•122 comments

Microsoft Office migration from Source Depot to Git

https://danielsada.tech/blog/carreer-part-7-how-office-moved-to-git-and-i-loved-devex/
111•dshacker•7h ago•92 comments

The hunt for Marie Curie's radioactive fingerprints in Paris

https://www.bbc.com/future/article/20250605-the-hunt-for-marie-curies-radioactive-fingerprints-in-paris
24•rmason•2d ago•1 comment

Show HN: Eyesite - experimental website combining computer vision and web design

https://blog.andykhau.com/blog/eyesite
60•akchro•6h ago•8 comments

AOSP project is coming to an end

https://old.reddit.com/r/StallmanWasRight/comments/1l8rhon/aosp_project_is_coming_to_an_end/
95•kaladin-jasnah•1h ago•21 comments

Research suggests Big Bang may have taken place inside a black hole

https://www.port.ac.uk/news-events-and-blogs/blogs/space-cosmology-and-the-universe/what-if-the-big-bang-wasnt-the-beginning-our-research-suggests-it-may-have-taken-place-inside-a-black-hole
467•zaik•11h ago•408 comments

Show HN: Spark, An advanced 3D Gaussian Splatting renderer for Three.js

https://sparkjs.dev/
270•dmarcos•14h ago•59 comments

Plants hear their pollinators, and produce sweet nectar in response

https://www.cbc.ca/listen/live-radio/1-51-quirks-and-quarks/clip/16150976-plants-hear-pollinators-produce-sweet-nectar-response
244•marojejian•4d ago•48 comments

How I Program with Agents

https://crawshaw.io/blog/programming-with-agents
436•bumbledraven•3d ago•241 comments

V-JEPA 2 world model and new benchmarks for physical reasoning

https://ai.meta.com/blog/v-jepa-2-world-model-benchmarks/
235•mfiguiere•16h ago•77 comments

How long it takes to know if a job is right for you or not

https://charity.wtf/2025/06/08/on-how-long-it-takes-to-know-if-a-job-is-right-for-you-or-not/
159•zdw•2d ago•99 comments

My Cord-Cutting Adventure

http://brander.ca/cordcut/
58•wizardforhire•3d ago•33 comments

Show HN: Ikuyo, a Travel Planning Web Application

https://ikuyo.kenrick95.org/
257•kenrick95•18h ago•84 comments

Unveiling the EndBOX – A microcomputer prototype for EndBASIC

https://www.endbasic.dev/2025/06/unveiling-the-endbox.html
24•jmmv•7h ago•7 comments

OpenAI o3-pro

https://help.openai.com/en/articles/9624314-model-release-notes
224•mfiguiere•1d ago•119 comments

Bypassing GitHub Actions policies in the dumbest way possible

https://blog.yossarian.net/2025/06/11/github-actions-policies-dumb-bypass
185•woodruffw•17h ago•92 comments

Congratulations on creating the one billionth repository on GitHub

https://github.com/AasishPokhrel/shit/issues/1
475•petercooper•9h ago•108 comments

The curious case of shell commands, or how "this bug is required by POSIX" (2021)

https://notes.volution.ro/v1/2021/01/notes/502e747f/
117•wonger_•1d ago•69 comments

Show HN: RomM – An open-source, self-hosted ROM manager and player

https://github.com/rommapp/romm
189•gassi•16h ago•75 comments

Fine-tuning LLMs is a waste of time

https://codinginterviewsmadesimple.substack.com/p/fine-tuning-llms-is-a-huge-waste
124•j-wang•1d ago•55 comments

Show HN: S3mini – Tiny and fast S3-compatible client, no-deps, edge-ready

https://github.com/good-lly/s3mini
235•neon_me•22h ago•92 comments

TV Fool: See OTA channels you can receive

https://www.tvfool.com/index.php?option=com_wrapper&Itemid=29
15•nvahalik•4h ago•5 comments

Firefox OS's story from a Mozilla insider not working on the project (2024)

https://ludovic.hirlimann.net/2024/01/firefox-oss-story-from-mozila-insider.html
151•todsacerdoti•19h ago•96 comments

In case of emergency, break glass

https://morrick.me/archives/10048
3•microflash•2h ago•0 comments

Shaped (YC W22) Is Hiring

https://www.ycombinator.com/companies/shaped/jobs/qtQwxJO-head-of-engineering
1•tullie•10h ago

The Canadian C++ Conference

https://cppnorth.ca/index.html
21•BiraIgnacio•7h ago•5 comments

EchoLeak – 0-Click AI Vulnerability Enabling Data Exfiltration from 365 Copilot

https://www.aim.security/lp/aim-labs-echoleak-blogpost
196•pvg•12h ago•68 comments

DeskHog, an open-source developer toy

https://posthog.com/deskhog
199•constantinum•17h ago•80 comments

James Florio Turned Patrick Dougherty's Sculptures into Stellar Photography

https://aboutphotography.blog/blog/behind-the-scenes-with-phil-penman-the-making-of-new-york-street-diaries-book-spotlight
5•ChompChomp•3d ago•0 comments

Reinforcement Pre-Training

https://arxiv.org/abs/2506.08007
66•frozenseven•2d ago

Comments

hzia•1d ago
This is very exciting! Existing data will become a lot more valuable, and this brings it one step closer to how we learn as humans!

The downside is that this is going to be extremely expensive, so the dataset used for RL will need to be curated.

watsonmusic•1d ago
Can't wait to see how it goes beyond the current LLM training pipeline.
nsagent•1d ago
It's clear that you're either one of the authors or a friend of theirs. You created this account 8 months ago to comment on another paper [1] that was released by the same authors.

[1]: https://news.ycombinator.com/item?id=41776324

dgshsg•1d ago
I notice that you can do this recursively to arbitrary depth. The cost is terrible though.
watsonmusic•1d ago
It could be adaptive: only high-value tokens would be allocated more compute.
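
A rough sketch of that adaptive-allocation idea, assuming a Hugging Face causal LM as the base model; the model name, the entropy threshold, and the fixed per-position budget are all illustrative, not taken from the paper:

# Sketch of the adaptive-compute idea: score each next-token position by the
# base model's predictive entropy and give only the hardest positions a
# reasoning budget. Model name, threshold, and budget are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")       # stand-in base model
lm = AutoModelForCausalLM.from_pretrained("gpt2")

text = "The capital of Australia is Canberra, not Sydney."
ids = tok(text, return_tensors="pt").input_ids    # (1, T)

with torch.no_grad():
    logits = lm(ids).logits                       # position t predicts token t+1
probs = logits.softmax(dim=-1)
entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)[0, :-1]  # nats

THRESHOLD = 2.0                                   # hypothetical cutoff, in nats
budget = (entropy > THRESHOLD).long() * 8         # reasoning tokens per position

for t, (h, b) in enumerate(zip(entropy.tolist(), budget.tolist())):
    nxt = tok.decode([ids[0, t + 1].item()])
    print(f"pos {t:2d} next={nxt!r:<12} entropy={h:4.2f} budget={b}")

Here the base model's predictive entropy is just a cheap proxy for which positions are "high-value"; only those positions would get a reasoning budget in a later RL pass.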
babelfish•1d ago
So marginally better (and occasionally worse) performance for an order of magnitude larger training costs…?
watsonmusic•1d ago
The 14B model performs comparably to a 32B one; the improvement is huge.
85392_school•1d ago
Are we only comparing them in terms of text-completion accuracy? Does it also improve performance on benchmarks?
watsonmusic•1d ago
A new scaling paradigm finally comes out!
beauzero•1d ago
Interesting
NotAnOtter•1d ago
I'm interested in how an innovation like this affects the business prospects.

Let's assume this is a paradigm shift on the scale of Transformers / `Attention is all you need`. Companies build out new models and pump another $100 Billion through it. And then a year from now, another innovation comes out. Same circus. And again.

No one wants to be left behind but trying to keep up will sink smaller companies.

curious_cat_163•1d ago
I am not sure why this ought to require "pump another $100 Billion". Could you elaborate?

Yes, the more recent generations of GPUs optimize for attention math. But they are still fairly "general-purpose" accelerators as well. So when I see papers like this (interesting idea, btw!), my mental model for costs suggests that the CapEx to buy up the GPUs and build out the data centers would get reused for this and hundreds of other ideas and experiments.

And then the hope is that the best ideas will occupy more of the available capacity...

gessha•1d ago
Sir, this is an arxiv paper
NotAnOtter•1d ago
So true, just like this one: https://arxiv.org/abs/1706.03762
Imnimo•1d ago
This is an interesting way of squeezing extra feedback from raw text, but I'm a little skeptical that it's the best way to spend training flops. It feels like most "next tokens" are pretty low information (even after filtering for entropy like they do). Does it make sense to spend a bunch of compute on a reasoning trace on them? Maybe if you're harshly data limited, but not compute limited?
rafaelero•1d ago
This should be used for high entropy tokens during pre-training.
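
For concreteness, a minimal sketch of the kind of reward this sub-thread is describing: the policy writes a short trace about the hidden next token and gets a verifiable reward only when its final answer matches the ground truth. The prompt template, the "Answer:" convention, and the sampling settings are assumptions for illustration, not the paper's exact recipe:

# Sketch of a verifiable next-token reward: the policy reasons about the
# hidden next token; reward is 1.0 only if its final answer matches the
# ground truth. Prompt format, the 'Answer:' convention, and sampling
# settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")       # stand-in policy model
lm = AutoModelForCausalLM.from_pretrained("gpt2")

def next_token_reward(trace: str, gold_token: str) -> float:
    """1.0 iff the trace ends with 'Answer: <gold_token>', else 0.0."""
    answer = trace.rsplit("Answer:", 1)[-1].strip()
    return float(answer == gold_token.strip())

context = "The chemical symbol for gold is"
gold = "Au"                                       # ground-truth next token

prompt = (f"{context} ...\nReason about the most likely next word, "
          "then finish with 'Answer: <word>'.\n")
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    out = lm.generate(ids, max_new_tokens=48, do_sample=True, top_p=0.95,
                      pad_token_id=tok.eos_token_id)
trace = tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

print(trace)
print("reward:", next_token_reward(trace, gold))  # scalar that would drive the RL update

With a stronger policy model, this binary signal over ordinary pre-training text is the "extra feedback from raw text" being discussed above.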
ntonozzi•1d ago
Is there any work on using some kind of soft tokens for reasoning? It seems so inefficient to encode so much information into a single token for the next pass of the model, when you could instead output a large vector on each forward pass, giving a drastically larger working memory/scratchpad and much higher bandwidth for the model to pass information forward to the next token call. If a single token carries 17 bits of information, a vector of 1024 floats could carry 32,768 bits.
ntonozzi•13h ago
I just found a recent paper about this: https://arxiv.org/abs/2505.15778. It's really thoughtful and well written. They mix the different token outputs together.
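
A toy sketch of the soft-token idea under discussion, in the spirit of that paper: rather than sampling a discrete token at each reasoning step, the last hidden state is fed back in as the next input embedding, so each step passes a full vector forward. The model choice, the number of latent steps, and the direct re-injection of the raw hidden state (published approaches typically project or mix token embeddings instead) are all illustrative assumptions:

# Toy sketch of "soft" reasoning steps: feed the last hidden state back in as
# the next input embedding instead of sampling a discrete token, so each
# "thought" step carries a full vector rather than ~17 bits.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
embed = lm.get_input_embeddings()                 # token id -> hidden-size vector
H = lm.config.hidden_size

prompt_ids = tok("Question: what is 17 * 23?", return_tensors="pt").input_ids
inputs_embeds = embed(prompt_ids)                 # (1, T, H)

# Bandwidth comparison from the comment above (raw bits, not usable capacity):
print(f"discrete token: <= {math.log2(lm.config.vocab_size):.1f} bits")
print(f"soft token    : {H} floats ~= {H * 32} bits")

N_LATENT = 4                                      # hidden "thinking" steps
with torch.no_grad():
    for _ in range(N_LATENT):
        out = lm(inputs_embeds=inputs_embeds, output_hidden_states=True)
        last = out.hidden_states[-1][:, -1:, :]   # (1, 1, H) latent thought
        inputs_embeds = torch.cat([inputs_embeds, last], dim=1)

    # After the latent steps, decode a normal token from the extended context.
    logits = lm(inputs_embeds=inputs_embeds).logits
next_id = logits[0, -1].argmax().item()
print("first decoded token:", tok.decode([next_id]))

The point of the printout is the bandwidth gap the comment above mentions: a discrete token carries at most log2(vocab size) bits per step, while even one float32 hidden vector carries orders of magnitude more raw information.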