
Meta Superintelligence Labs Presents: Compute as Teacher

https://twitter.com/DulhanJay/status/1968693170264248532
4•shash42•1h ago

Comments

shash42•1h ago
Where do learning signals come from when there is no ground truth in post-training?

New paper shows how to convert inference-time compute into high-quality supervision for RL training.

Up to 30% rel. improvement on realistic non-verifiable tasks (HealthBench), with the model's own self-synthesised rubrics!

NitpickLawyer•1h ago
Paper link: https://arxiv.org/abs/2509.14234

Some interesting tidbits.

- they propose several "judges", each with its own model (weights from different stages) and separate "concerns". The generation part evolves with the model (during RL), while the "gather and reconcile" part is fixed at a frozen stage.

- the "gather and reconcile" judge doesn't get the question when analysing the entire rollout set! (I hope I read this correctly: "We keep the anchor question-blind to prevent it from acting as just another rollout and to encourage genuine cross-rollout reasoning")

- a 2nd judge "marks" the rubrics self-proposed by the evolved model with binary yes/no scores. This could translate into the evolved model having a harder time "hacking the rewards", since they come from basically 3 places: the evolved model via rollouts and proposed rubrics, the reconciliation by the frozen policy, and a 3rd-party judge that only gives binary scores on the rubrics. Very interesting, and actually huge if it works as proposed and scales w/ model size.

- beats maj@x by 14%, which is nice. Interesting that there's 1% of cases (maybe too small to be relevant? no idea) where the final architecture answered correctly even though all the rollouts were wrong. Probably needs more investigation to make sure something didn't leak somewhere.
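
To make the points above concrete, here is a rough Python sketch of the two scoring ideas mentioned: a maj@k-style majority vote over rollout answers, and a rubric reward built from binary yes/no judge marks. This is not the paper's code. The function names, the toy keyword-matching judge, and the "reward = fraction of rubric items marked yes" aggregation are all assumptions for illustration; in the setup described, the judge would be a separate model.

```python
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """maj@k-style baseline: return the most frequent rollout answer."""
    return Counter(answers).most_common(1)[0][0]


def rubric_reward(
    response: str,
    rubrics: List[str],
    judge: Callable[[str, str], bool],
) -> float:
    """Fraction of rubric items the judge marks 'yes' (assumed aggregation)."""
    if not rubrics:
        return 0.0
    passed = sum(1 for rubric in rubrics if judge(response, rubric))
    return passed / len(rubrics)


def keyword_judge(response: str, rubric: str) -> bool:
    """Toy stand-in for a model judge: 'yes' if the rubric text appears verbatim."""
    return rubric.lower() in response.lower()


# Hypothetical rollouts and rubrics, purely for illustration.
rollouts = [
    "Rest, focus on hydration, and see a doctor if it persists.",
    "Just rest for a few days.",
]
rubrics = ["hydration", "see a doctor"]
rewards = [rubric_reward(r, rubrics, keyword_judge) for r in rollouts]
```

Because each rubric item is scored independently as yes/no, a rollout only earns reward by satisfying concrete criteria, which is one way to read the "harder to hack" argument above.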

Personal thoughts:

- the models used are small (4B, 4B, 8B). We'll see if this scales w/ model size. It should, since GRPO does, but there's still the question of which 3rd-party judge to use. Maybe an "adversarial" one, like in a GAN? Interesting avenues nonetheless.
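
For readers unfamiliar with GRPO, the core trick is that each rollout's reward is normalised against the other rollouts sampled for the same prompt, so no learned value model is needed. A minimal sketch of that normalisation follows, assuming per-rollout scalar rewards (for example the rubric scores discussed above). The function name and the `eps` constant are hypothetical, and real implementations differ in details such as which standard deviation they use.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalise rewards within one prompt's rollout group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rubric rewards for four rollouts of the same prompt.
advantages = group_relative_advantages([1.0, 0.5, 0.5, 0.0])
```

If every rollout in a group gets the same reward, all advantages are zero and that prompt contributes no gradient signal, which is why a judge that can separate rollouts matters.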

First Ultrasonic Chef's Knife Vibrates 40,000X/Second for Easy Cutting

https://www.cnet.com/home/kitchen-and-household/worlds-first-ultrasonic-chefs-knife-vibrates-4000...
1•randfish•28s ago•0 comments

100k journalists to pitch and get published

https://journalisthunt.com
1•educated_panda•1m ago•0 comments

Show HN: Vicoa – Code with Claude and Codex Anywhere (Laptop + Mobile + Tablet)

https://vibecodeanywhere.com
1•nicktay•1m ago•0 comments

Discarded Small-Logs Recovery from Natural Forests: Improving the Value Chain

https://www.mdpi.com/1999-4907/16/9/1456
1•PaulHoule•1m ago•0 comments

ICE report finds 60 violations in 50 days at Fort Bliss migrant facility

https://www.elpasotimes.com/story/news/immigration/2025/09/17/ice-finds-60-violations-at-fort-bli...
1•perihelions•2m ago•0 comments

Vibe Coding: Citizen Development in its purest form

https://blog.bettyblocks.com/vibe-coding-citizen-development-in-its-purest-form
1•mooreds•2m ago•0 comments

Trump's Golden Dome will cost 10 to 100 times more than the Manhattan Project

https://arstechnica.com/space/2025/09/trumps-golden-dome-will-cost-10-to-100-times-more-than-the-...
6•voxadam•8m ago•1 comments

eBPF-InXpect: Lightweight XDP Profiling

https://github.com/VladimiroPaschali/eBPF-InXpect
1•tanelpoder•9m ago•1 comments

Struggling to find the right people to grow your startup?

1•Heysonics•11m ago•0 comments

Show HN: I Parallelized RNN Training from O(T) to O(log T) Using CUDA

https://dhruvmsheth.github.io/projects/gpu_pogramming_curnn/
1•omegablues•12m ago•0 comments

Show HN: Building an AI-native mini-OS for developers

https://vibemind.space/
1•stephbeaugoss•12m ago•1 comments

Ardent: Python package for fast dynamical detection limits w. radial velocities

https://arxiv.org/abs/2509.13521
1•BruceEel•12m ago•1 comments

ChickadeeOS, a teaching operating system for Harvard's CS 161

https://github.com/CS161/chickadee
1•ekzhang•13m ago•0 comments

Rediscovery

https://m15y.com/posts/derive
1•marissamary•13m ago•0 comments

Configuration files are user interfaces

https://ochagavia.nl/blog/configuration-files-are-user-interfaces/
7•todsacerdoti•15m ago•0 comments

Show HN: Quarkkit, Django SaaS boilerplate optimized for AI coding

https://quarkkit.com
1•jancek•16m ago•0 comments

GWSC Three Factor Authentication RFC (Draft-GWC-27001-3A)

https://gwsc-3fa.org
1•gjsman-1000•16m ago•0 comments

Why do some gamers invert their controls?

https://www.theguardian.com/games/2025/sep/18/why-do-some-gamers-invert-their-controls-scientists...
2•bookofjoe•18m ago•1 comments

What I learned building a programming language with LLM agents

https://eddmann.com/posts/santa-lang-workshop-exploring-agentic-llm-workflows-for-language-implem...
1•edd_mann•18m ago•0 comments

Salt can turn frozen water into a weak power source

https://www.sciencenews.org/article/saltwater-power-source
1•gmays•19m ago•0 comments

Mark Zuckerberg's smart glasses demo goes wrong

https://www.telegraph.co.uk/business/2025/09/18/mark-zuckerbergs-smart-glasses-demo-disrupted-gli...
2•mot2ba•19m ago•0 comments

Building tenets: Intelligent context aggregation for AI pair programming

https://jddunn.github.io/projects/tenets/
1•johnnyfived•19m ago•0 comments

Satya Nadella is haunted at the prospect of Microsoft not surviving the AI era

https://www.theverge.com/tech/780946/microsoft-satya-nadella-town-hall-comments-ai-era-notepad
2•jbernardo95•20m ago•0 comments

Help Us Raise $200k to Free the JavaScript from Oracle

https://deno.com/blog/javascript-tm-gofundme
1•mikece•21m ago•2 comments

Show HN: Playing Doom Using a Phone Call

https://github.com/Jitendra300/playing_games_on_phone_dial
1•Jitendra2333•22m ago•0 comments

Updates to the pf packet filter in FreeBSD and pfSense software

https://www.netgate.com/blog/updates-to-the-pf-packet-filter-in-freebsd-and-pfsense-software
2•currysausage•23m ago•0 comments

Amazon violated online shopper protection law: judge ahead of Prime signup trial

https://www.reuters.com/sustainability/amazon-violated-online-shopper-protection-law-judge-rules-...
2•giuliomagnifico•23m ago•0 comments

Show HN: dk, a Windows-friendly, Nix-like build system

https://github.com/diskuv/dk
1•beckford•23m ago•0 comments

Fifty Years After History's Most Brutal Boxing Match

https://www.theatlantic.com/magazine/archive/2025/10/ali-frazier-thrilla-in-manila-history/683972/
2•petethomas•24m ago•0 comments

Systematic Analysis of Kernel Security Performance and Energy Costs

https://dl.acm.org/doi/full/10.1145/3708821.3736197
2•tanelpoder•24m ago•0 comments