Show HN: MTXT – Music Text Format

https://github.com/Daninet/mtxt
55•daninet•4d ago•20 comments

Show HN: Walrus – a Kafka alternative written in Rust

https://github.com/nubskr/walrus
91•janicerk•3d ago•30 comments

Show HN: I built a dashboard to compare mortgage rates across 120 credit unions

https://finfam.app/blog/credit-union-mortgages
330•mhashemi•19h ago•105 comments

Show HN: OnlyRecipe 2.0 – I added all features HN requested – 4 years later

https://onlyrecipeapp.com/?url=https://www.allrecipes.com/turkish-pasta-recipe-8754903
2•AwkwardPanda•1h ago•1 comments

Show HN: Do we need MCPs? Reverse-engineered Slack and Linear API for Evals & RL

https://www.agentdiff.dev/
9•hubertmarek•1h ago•2 comments

Show HN: I analyzed 8k near-death experiences with AI and made them listenable

https://www.noeticmap.com/
15•mikias•1h ago•7 comments

Show HN: A Minimal Monthly Task Planner (printable, offline, no signup)

https://printcalendar.top/
80•defcc•10h ago•27 comments

Show HN: Mirror_bridge – C++ Reflection powered Python binding generation

https://github.com/FranciscoThiesen/mirror_bridge
22•fthiesen•10h ago•2 comments

Show HN: ProbeOps Horizon Browser – Test your site from different countries

https://probeops.com/
3•kumaras•2h ago•0 comments

Show HN: Fresh – A new terminal editor built in Rust

https://sinelaw.github.io/fresh/
163•_sinelaw_•1d ago•122 comments

Show HN: Msm – Minimal Snippet Manager for the shell (fzf-based)

https://github.com/mnalli/msm
4•mnalli•3h ago•2 comments

Show HN: Microlandia, a brutally honest city builder

https://microlandia.city
109•phaser•22h ago•20 comments

Show HN: Mdit – clean Markdown notes with local files

https://mdit.app/
2•hjinco•3h ago•1 comments

Show HN: Is Friendly AI an Attractor? Self-Reports from 22 Models Say No

https://www.lesswrong.com/posts/qE2cEAegQRYiozskD/is-friendly-ai-an-attractor-self-reports-from-2...
3•jsnider3•51m ago•1 comments

Show HN: Identify test coverage gaps in your Go projects

https://github.com/LeanerCloud/testvet
11•alien_•3d ago•2 comments

Show HN: RAG in 3 Lines of Python

https://pypi.org/project/piragi/
16•init0•16h ago•4 comments

Show HN: Made HN, but for Music – Sonusly

https://www.sonusly.com/
3•lorenzosch•5h ago•0 comments

Show HN: FastLanes based integer compression in Zig

https://github.com/steelcake/zint
11•ozgrakkurt•3d ago•7 comments

Show HN: Banana Pro – AI image editing powered by Google's official API

https://banana-pro.io
2•derek39576•6h ago•0 comments

Show HN: I made a simple, 100% free marketplace to buy or sell micro-startups

https://buy-startups.com/
2•aiseoscan•8h ago•0 comments

Show HN: Searchable AI visibility index (15k+ brands, 500 industries)

https://trakkr.ai/rankings/
4•mektrik•8h ago•0 comments

Show HN: TidesDB – A storage engine that outperforms RocksDB

https://github.com/tidesdb/tidesdb
3•alexpadula•11h ago•0 comments

Show HN: Marmot – Single-binary data catalog (no Kafka, no Elasticsearch)

https://github.com/marmotdata/marmot
97•charlie-haley•2d ago•21 comments

Show HN: Onetone – A full-stack framework with custom C interpreter

https://github.com/onetoneframework/framework
2•tactics6655•10h ago•0 comments

Show HN: Stanford's ACE paper was just open sourced

https://github.com/ace-agent/ace
5•vmsn•17h ago•1 comments

Show HN: AI music and auto-charting and custom rhythm minigame sandbox

https://rhythm-seodang-web.vercel.app/
5•sputnikwrkshp•13h ago•0 comments

Show HN: EchoCopi – Local-first, model-agnostic alternative to Google Antigravity

3•sparksupernova•14h ago•0 comments

Show HN: A $20/year invoicing tool for solo developers (simple, fast, no bloat)

https://sidepay.app/
11•mightbefun•1d ago•4 comments

Show HN: The Taka Programming Language

https://codeberg.org/marton/taka
11•mgunyho•1d ago•4 comments

Show HN: Boing

https://boing.greg.technology/
775•gregsadetsky•4d ago•145 comments

Show HN: Do we need MCPs? Reverse-engineered Slack and Linear API for Evals & RL

https://www.agentdiff.dev/
8•hubertmarek•1h ago

Comments

hubertmarek•1h ago
Hi HN, I noticed it is almost impossible to run evals or train models on 3rd party integrations, so I built interactive environments for them. Feedback is more than welcome. Thanks!

Interesting fact: on evals of 40 tasks against the Linear API, most frontier models scored surprisingly well:

- Claude Opus 4.5: 95% (38/40)
- GLM 4.6: 87.5% (35/40)
- Claude Sonnet 4.5: 85% (34/40)
- Claude Haiku 4.5: 82.5% (33/40)
- Kimi K2: 82.5% (33/40)
- Grok 4.1 Fast: 80% (32/40)
- GPT 5.1: 77.5% (31/40)

This makes me wonder whether we really need to reinvent the wheel and build special interfaces (MCPs) for agents interacting with services, when they can just use the APIs as they are.

zachderhake•32m ago
I tested this. The pain point for me is that running evals against production APIs requires maintaining mock accounts, dealing with rate limiting from integrations, and constantly cleaning up test data. The interception approach here (agent calls real URLs, gets routed to local fakes) seems like it might eliminate that overhead... I dealt with this at my last startup and it was really annoying.
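
For anyone wondering what "agent calls real URLs, gets routed to local fakes" could look like, here is a minimal sketch of the URL-rewriting step in Python. This is not how agentdiff actually does it (the thread doesn't say); the `FAKE_ROUTES` table, the localhost ports, and the `route_to_fake` helper are all made up for illustration. The idea is just that the agent keeps using the production URL and a thin layer swaps the scheme and host before the request goes out.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from production API hosts to local fake services.
# The ports are arbitrary assumptions, not anything from the project.
FAKE_ROUTES = {
    "api.linear.app": "http://localhost:8001",
    "slack.com": "http://localhost:8002",
}


def route_to_fake(url: str, routes: dict = FAKE_ROUTES) -> str:
    """Rewrite a production URL to its local fake, keeping path and query.

    URLs whose host is not in the routing table are returned unchanged.
    """
    parts = urlsplit(url)
    target = routes.get(parts.hostname or "")
    if target is None:
        return url  # not an intercepted service, leave it alone
    fake = urlsplit(target)
    # Swap scheme and host:port, keep everything the agent asked for.
    return urlunsplit(
        (fake.scheme, fake.netloc, parts.path, parts.query, parts.fragment)
    )
```

In a real harness this rewrite would sit inside whatever HTTP client or proxy the agent's tool calls go through, so the agent never sees the local address and the test data lives only in the fake.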