Show HN: Tilth v0.3 – 17% cheaper AI code navigation (279 runs, 3 Claude models)

2•jahala•2h ago
tilth gives AI agents structural code intelligence (tree-sitter definitions, callee resolution, smart outlining) via MCP. I benchmarked it on 21 code navigation tasks across 4 real repos (Express, FastAPI, Gin, ripgrep).

-> https://github.com/jahala/tilth

Results: Sonnet 4.5 — 26% cheaper per correct answer (79% → 86% accuracy). Opus 4.6 — 14% cheaper (and the only model+mode combo to crack the hardest task). Haiku 4.5 — 82% cheaper when forced to use tilth (69% → 100% accuracy at $0.04/answer).

We measure “cost per correct answer” — what you’d expect to spend before getting a usable answer under retry. A wrong answer isn’t a cheap success.
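One way to read this metric, as a rough sketch: if each attempt costs a fixed amount and is correct with some probability, and retries are independent, the expected spend until the first correct answer is cost divided by accuracy. The function names and the numbers below are hypothetical illustrations, not taken from the tilth benchmark code or its results.

```python
def cost_per_correct_answer(cost_per_attempt: float, accuracy: float) -> float:
    """Expected spend until the first correct answer.

    Assumption (not from the benchmark source): every attempt costs the same
    and succeeds independently with probability `accuracy`, so the expected
    number of attempts is 1 / accuracy.
    """
    if cost_per_attempt < 0:
        raise ValueError("cost_per_attempt must be non-negative")
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    return cost_per_attempt / accuracy


def relative_saving(baseline_cost: float, candidate_cost: float) -> float:
    """Fraction saved by the candidate relative to the baseline (0.25 means 25% cheaper)."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 1.0 - candidate_cost / baseline_cost


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the arithmetic.
    baseline = cost_per_correct_answer(cost_per_attempt=0.10, accuracy=0.50)
    with_tool = cost_per_correct_answer(cost_per_attempt=0.12, accuracy=0.80)
    print(f"baseline:  ${baseline:.3f} per correct answer")
    print(f"with tool: ${with_tool:.3f} per correct answer")
    print(f"saving:    {relative_saving(baseline, with_tool):.0%}")
```

In this made-up case the tool-assisted run costs more per attempt but less per correct answer, because higher accuracy means fewer expected retries.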

Interesting finding: smarter models adopt MCP tools voluntarily (Sonnet 95%, Opus 94%), but Haiku ignores them (9%). Instruction tuning didn’t help. Removing the overlapping built-in tools did.

https://github.com/jahala/tilth/blob/main/benchmark/README.m...

PS: I don't have the budget to run the benchmark many times with Opus, so if any token whales have the capacity to run some benchmarks, please feel free to PR results.

Show HN: Maravel-Framework 10.62.8 speeds up the console via commands:cache

https://marius-ciclistu.medium.com/maravel-framework-10-62-8-speeds-up-the-console-via-commands-c...
1•marius-ciclistu•4m ago•0 comments

My Nanbeige4.1 3B chat room can now generate micro applications [video]

https://www.youtube.com/watch?v=WvT5cp6Za24
1•ToJans•5m ago•0 comments

Underrated Music Software – Royalty-Free

https://midigen.app/
1•thriftman•5m ago•0 comments

Dune II written in HTML5/JS

https://github.com/oklemenz/Dune2JS
1•reconnecting•8m ago•0 comments

Show HN: Crypthold – Deterministic, Tamper-Evident Secure State Engine

https://github.com/laphilosophia/crypthold
1•laphilosophia•8m ago•0 comments

Language models imply world models

https://blog.plover.com/tech/gpt/micro-worlds-2.html
1•gbacon•9m ago•0 comments

Echoed.gg – Discord Alternative

https://echoed.gg/
1•shaongitbd•9m ago•0 comments

GLM-5 topped the coding benchmarks. Then I used it

https://charlesazam.com/blog/glm5-benchmark-reality/
2•couAUIA•10m ago•1 comments

Show HN: PrivateWhisper – Run Whisper locally on macOS (offline transcription)

https://privatewhisper.app/
1•matyashajek•11m ago•1 comments

A minimal terminal coding agent harness

https://pi.dev/
1•thomascountz•11m ago•0 comments

It Isn't the Tool, but the Hands – A Response to "Something Big Is Happening"

1•markferraz•18m ago•0 comments

Dbt-Workbench, an open-source UI for working with dbt projects

https://github.com/rezer-bleede/dbt-Workbench
1•remisharoon•18m ago•1 comments

Show HN: PolyMCP – A framework for building and orchestrating MCP agents

2•justvugg•20m ago•1 comments

Dao Heart 3.11 Identity Preserving Value Evolution for Frontier AI Systems

https://github.com/Mankirat47/Dao-Heart_3.1
1•Mankirat47•21m ago•1 comments

Backboard.io Becomes First AI Platform to Lead Both Major Memory Benchmarks

https://backboard.io/changelog/backboard.io-becomes-first-ai-platform-to-lead-both-major-memory-b...
1•robimbeault•23m ago•2 comments

Show HN: An automaton's code review of Gas Town with sycophancy-mode disabled

2•burnerToBetOut•25m ago•0 comments

'RageCheck' Points Out Manipulative Language in News Articles

https://lifehacker.com/tech/ragecheck-manipulative-language-news-articles
1•gnabgib•26m ago•0 comments

Ask HN: "Hacker News Fixed Width for Widescreen Monitors" Userstyle?

1•MollyRealized•27m ago•0 comments

Extend Trust Across the Software Supply Chain with Red Hat Trusted Libraries

https://www.redhat.com/en/blog/extend-trust-across-software-supply-chain-red-hat-trusted-libraries
1•jruohonen•29m ago•1 comments

CIA, Pentagon reviewed secret 'Havana syndrome' device in Norway, WaPo reports

https://www.reuters.com/business/healthcare-pharmaceuticals/cia-pentagon-reviewed-secret-havana-s...
1•alephnerd•32m ago•0 comments

I Analyzed 227M Rows of Medicaid Data. Here's a Sample of What I Found in Maine

https://twitter.com/lukethomas14/status/2022519245553160237
2•NewCzech•32m ago•0 comments

AI: A Bridge Toward Diverse Intelligence

https://www.noemamag.com/ai-could-be-a-bridge-toward-diverse-intelligence/
1•kjhughes•32m ago•0 comments

How to Write Mathematical Papers by Bruce C. Berndt [pdf]

https://alozano.clas.uconn.edu/wp-content/uploads/sites/490/2020/08/berndt.pdf
1•paulpauper•33m ago•0 comments

Cursor: Expanding our long-running agents research preview

https://cursor.com/blog/long-running-agents
3•mustaphah•33m ago•0 comments

Show HN: Cappu – ADHD'er take on a different task manager

https://cappu.app/
1•arajnoha•33m ago•0 comments

PlantNet; Identify, explore and share your observations of wild plants

https://identify.plantnet.org
2•thunderbong•36m ago•0 comments

Jeffrey Epstein spent years building ties to well-known hackers: Politico

https://www.politico.com/news/2026/02/14/epsteins-hackers-defcon-black-hat-00779365
2•star-glider•36m ago•0 comments

Show HN: Logbooks, notebook computing for coding agents

https://github.com/rwhaling/logbooks
2•rwhaling•36m ago•0 comments

Wazir Drop: a tournament winning board game AI engine

https://github.com/tczajka/wazir-drop
1•stared•37m ago•1 comments

Siri, Alexa, ChatGPT, and OpenClaw: What's Different?

https://openclaw.rocks/blog/openclaw-vs-siri-alexa-chatgpt
1•stubbi•38m ago•0 comments