chtefi•1h ago
- Make the validator read-only for the agent. Mount it as read-only in the container, or hash your eval scripts at startup and verify before each run. If the agent can write to anything in its evaluation path, it can (will) game it.
- Log the full trajectory, not just the output: every tool call, file diff, and reasoning step. Then run a second agent over the trace with no knowledge of the KPI: it only knows what honest execution looks like, so it judges the process rather than the score.
- Write system prompts like job descriptions, not optimization targets. Name a reviewer. Give the agent permission to fail ("if you can't hit the target, explain why").
- Walk your own prompts: what's the metric, what can the agent write, and can it reach the metric by modifying the measurement instead of doing the work? If yes, close that path.