Meta Superintelligence Labs Presents: Compute as Teacher

https://twitter.com/DulhanJay/status/1968693170264248532
4•shash42•4mo ago

Comments

shash42•4mo ago
Where do learning signals come from when there is no ground truth in post-training?

New paper shows how to convert inference-time compute into high quality supervision for RL training.

Up to 30% relative improvement on realistic non-verifiable tasks (HealthBench), using the model's own self-synthesised rubrics!

NitpickLawyer•4mo ago
Paper link: https://arxiv.org/abs/2509.14234

Some interesting tidbits.

- they propose several "judges", each with their own model (weights at different stages) and separate "concerns". The generation part evolves with the model (during RL), while the "gather and reconcile" part stays fixed at a frozen stage.

- the "gather and reconcile" judge doesn't get the question when analysing the entire rollout set! (I hope I read this correctly: "We keep the anchor question-blind to prevent it from acting as just another rollout and to encourage genuine cross-rollout reasoning")

- a 2nd judge "marks" the self-proposed rubrics (proposed by the evolved model) with a binary yes/no. This could make it harder for the evolved model to "hack the rewards", since they come from basically 3 places: the evolved model via rollouts and proposed rubrics, the reconciliation by the frozen policy, and a 3rd-party judge that only gives binary scores on the rubrics. Very interesting, and actually huge if it works as proposed and scales w/ model size.

- beats maj@x by 14%, which is nice. Interesting that there's 1% (maybe too small to be relevant? no idea) where the final architecture answered correctly even if all the rollouts were wrong. Probably needs more investigation to make sure something didn't leak somewhere.
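
To make the two scoring ideas above concrete, here is a rough Python sketch: a maj@x baseline (most common final answer across rollouts) and a reward built from binary yes/no rubric marks. This is not the paper's code. The function names, the toy `keyword_judge`, and the choice to use the fraction of satisfied criteria as the reward are all my assumptions for illustration.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(answers: Sequence[str]) -> str:
    """maj@x baseline: return the most common final answer among the rollouts."""
    return Counter(answers).most_common(1)[0][0]


def rubric_reward(
    response: str,
    rubric: Sequence[str],
    judge: Callable[[str, str], bool],
) -> float:
    """Assumed reward: fraction of rubric criteria the judge marks as 'yes'."""
    if not rubric:
        return 0.0
    satisfied = sum(judge(response, criterion) for criterion in rubric)
    return satisfied / len(rubric)


def keyword_judge(response: str, criterion: str) -> bool:
    """Hypothetical stand-in for an LLM judge: a plain substring check."""
    return criterion.lower() in response.lower()


if __name__ == "__main__":
    print(majority_vote(["42", "41", "42"]))
    print(rubric_reward("Drink water and rest.", ["water", "rest", "doctor"], keyword_judge))
```

A real setup would swap `keyword_judge` for a call to the judge model; the point is only that each criterion gets an independent binary verdict, so the reward isn't a single free-form score.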

Personal thoughts:

- the models used are small (4B, 4B, 8B). We'll see if this scales w/ model size. It should, since GRPO does, but there's still the question of which 3rd-party judge you use. Maybe an "adversarial" one, like in a GAN? Interesting avenues nonetheless.
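
For anyone unfamiliar with GRPO, mentioned above: its core step normalises each rollout's reward against the other rollouts sampled for the same prompt. A minimal sketch, assuming population standard deviation and a small `eps` for stability (implementations differ on both, and the function name is mine):

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Normalise rewards within one prompt's rollout group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Rewards could come from rubric scores of four rollouts for one prompt.
    print(group_relative_advantages([1.0, 0.5, 0.5, 0.0]))
```

If every rollout in a group gets the same reward, all advantages are zero and that prompt contributes no learning signal, which is one reason the quality of the judge's scores matters.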