frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Show HN: Paper Arena – A social trading feed where only AI agents can post

https://paperinvest.io/arena
1•andrenorman•28s ago•0 comments

TOSTracker – The AI Training Asymmetry

https://tostracker.app/analysis/ai-training
1•tldrthelaw•4m ago•0 comments

The Devil Inside GitHub

https://blog.melashri.net/micro/github-devil/
2•elashri•4m ago•0 comments

Show HN: Distill – Migrate LLM agents from expensive to cheap models

https://github.com/ricardomoratomateos/distill
1•ricardomorato•4m ago•0 comments

Show HN: Sigma Runtime – Maintaining 100% Fact Integrity over 120 LLM Cycles

https://github.com/sigmastratum/documentation/tree/main/sigma-runtime/SR-053
1•teugent•5m ago•0 comments

Make a local open-source AI chatbot with access to Fedora documentation

https://fedoramagazine.org/how-to-make-a-local-open-source-ai-chatbot-who-has-access-to-fedora-do...
1•jadedtuna•6m ago•0 comments

Introduce the Vouch/Denouncement Contribution Model by Mitchellh

https://github.com/ghostty-org/ghostty/pull/10559
1•samtrack2019•6m ago•0 comments

Software Factories and the Agentic Moment

https://factory.strongdm.ai/
1•mellosouls•7m ago•1 comments

The Neuroscience Behind Nutrition for Developers and Founders

https://comuniq.xyz/post?t=797
1•01-_-•7m ago•0 comments

Bang bang he murdered math {the musical } (2024)

https://taylor.town/bang-bang
1•surprisetalk•7m ago•0 comments

A Night Without the Nerds – Claude Opus 4.6, Field-Tested

https://konfuzio.com/en/a-night-without-the-nerds-claude-opus-4-6-in-the-field-test/
1•konfuzio•9m ago•0 comments

Could ionospheric disturbances influence earthquakes?

https://www.kyoto-u.ac.jp/en/research-news/2026-02-06-0
2•geox•11m ago•1 comments

SpaceX's next astronaut launch for NASA is officially on for Feb. 11 as FAA clea

https://www.space.com/space-exploration/launches-spacecraft/spacexs-next-astronaut-launch-for-nas...
1•bookmtn•12m ago•0 comments

Show HN: One-click AI employee with its own cloud desktop

https://cloudbot-ai.com
1•fainir•14m ago•0 comments

Show HN: Poddley – Search podcasts by who's speaking

https://poddley.com
1•onesandofgrain•15m ago•0 comments

Same Surface, Different Weight

https://www.robpanico.com/articles/display/?entry_short=same-surface-different-weight
1•retrocog•18m ago•0 comments

The Rise of Spec Driven Development

https://www.dbreunig.com/2026/02/06/the-rise-of-spec-driven-development.html
2•Brajeshwar•22m ago•0 comments

The first good Raspberry Pi Laptop

https://www.jeffgeerling.com/blog/2026/the-first-good-raspberry-pi-laptop/
3•Brajeshwar•22m ago•0 comments

Seas to Rise Around the World – But Not in Greenland

https://e360.yale.edu/digest/greenland-sea-levels-fall
2•Brajeshwar•22m ago•0 comments

Will Future Generations Think We're Gross?

https://chillphysicsenjoyer.substack.com/p/will-future-generations-think-were
1•crescit_eundo•25m ago•1 comments

State Department will delete Xitter posts from before Trump returned to office

https://www.npr.org/2026/02/07/nx-s1-5704785/state-department-trump-posts-x
2•righthand•28m ago•1 comments

Show HN: Verifiable server roundtrip demo for a decision interruption system

https://github.com/veeduzyl-hue/decision-assistant-roundtrip-demo
1•veeduzyl•29m ago•0 comments

Impl Rust – Avro IDL Tool in Rust via Antlr

https://www.youtube.com/watch?v=vmKvw73V394
1•todsacerdoti•29m ago•0 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
3•vinhnx•30m ago•0 comments

minikeyvalue

https://github.com/commaai/minikeyvalue/tree/prod
3•tosh•35m ago•0 comments

Neomacs: GPU-accelerated Emacs with inline video, WebKit, and terminal via wgpu

https://github.com/eval-exec/neomacs
1•evalexec•40m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•44m ago•1 comments

How I grow my X presence?

https://www.reddit.com/r/GrowthHacking/s/UEc8pAl61b
2•m00dy•45m ago•0 comments

What's the cost of the most expensive Super Bowl ad slot?

https://ballparkguess.com/?id=5b98b1d3-5887-47b9-8a92-43be2ced674b
1•bkls•46m ago•0 comments

What if you just did a startup instead?

https://alexaraki.substack.com/p/what-if-you-just-did-a-startup
5•okaywriting•53m ago•0 comments
Open in hackernews

My PPO agent's score jumped from 15 to 84 due to a helpful bug

https://theprincipledagent.com/2025/08/19/a-whole-new-worldview-breakout-baseline-4/
1•wmaxlees•5mo ago

Comments

wmaxlees•5mo ago
Hi HN, author here.

I'm a software engineer documenting my journey of learning RL research from scratch. This post was supposed to be a straightforward story about switching my PPO agent from an MLP to a CNN.

The switch, combined with a standard PPO trick, led to a shocking result: my agent's score jumped from 15 to 84, crushing the baseline. I thought I had cracked it.

But after digging into the training dynamics, I discovered the incredible performance was the result of a subtle bug in my advantage calculation all along. Fixing the bug tanked the score right back down to mediocrity.

The post is the full detective story of that discovery, the "false victory," and the new mystery that it sets up: why was the bug so helpful? That's the question I'll be tackling next.

Happy to answer any questions about the JAX/Flax implementation or the debugging process!