frontpage.

Startup's new mechanistic interpretability tool lets you debug LLMs

https://www.technologyreview.com/2026/04/30/1136721/this-startups-new-mechanistic-interpretability-tool-lets-you-debug-llms/
2•joozio•1h ago

Comments

mdp2021•1h ago
The need for reliable automated reasoning over natural language grows more urgent by the day. LLMs have started the timer (or countdown), and collective reality assessment makes the urgency obvious. From the article, for example:

> For example, many models will tell you that 9.11 is greater than 9.9. Looking inside a model to see what’s going on might reveal that it is being influenced by neurons associated with the Bible, in which verse 9.9 comes before 9.11, or by code repositories where consecutive updates are numbered 9.9, 9.10, 9.11 and so on. Using this information, the model can be retrained to make it avoid its “Bible” neurons when doing math

...See, that's not how it works (you do not "exclude golfing movements" when you "pilot a helicopter").
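
To make the quoted example concrete, here is a minimal Python sketch of the two orderings it mentions: plain decimal comparison versus the component-wise ordering used for verse or version numbers. The helper names `as_decimal` and `as_dotted` are hypothetical, chosen only for illustration; nothing here reflects how the tool in the article or any model actually works internally.

```python
from typing import Tuple


def as_decimal(s: str) -> float:
    """Read the string as an ordinary decimal number."""
    return float(s)


def as_dotted(s: str) -> Tuple[int, ...]:
    """Read the string as dot-separated integer components,
    the way chapter.verse or major.minor labels are ordered."""
    return tuple(int(part) for part in s.split("."))


a, b = "9.11", "9.9"

# As decimals, 9.11 is smaller than 9.9 (since 0.11 < 0.90).
decimal_greater = as_decimal(a) > as_decimal(b)

# As dotted labels, 9.11 comes after 9.9 (since 11 > 9).
dotted_greater = as_dotted(a) > as_dotted(b)

# Sorting labels component-wise reproduces the 9.9, 9.10, 9.11 sequence.
ordered_labels = sorted(["9.11", "9.9", "9.10"], key=as_dotted)

print(decimal_greater, dotted_greater, ordered_labels)
```

The same pair of strings gives opposite answers depending on which reading is applied, which is the ambiguity the quoted passage describes.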

Terminal AI Coding Agents Comparison Table

https://terminaltrove.com/compare/ai-coding-agents/
1•pelcg•1m ago•0 comments

What we learned at YC's AI alumni event: running AI-native companies

https://antonyevans.com/blog/engineering/c068-coo-job-is-building-ai-operating-system/
1•technotony•2m ago•0 comments

The Human Creativity Benchmark – Evaluating Generative AI in Creative Work

https://contralabs.com/research/human-creativity-benchmark
1•0bytematt•3m ago•0 comments

ReadBetweenTheWalls – AI Jargon Translator

https://www.readbetweenthewalls.com
1•creatorcuffee•4m ago•0 comments

When the Partnership Breaks, So Does the Company: The Jawbone Story

https://joshcarterpdx.substack.com/p/when-the-partnership-breaks-so-does
1•josh_carterPDX•4m ago•0 comments

Pedometer++ 8.0 Brings a Redesigned Apple Watch App

https://pedometer.app/blog/v8-new-watch-app
1•Amorymeltzer•6m ago•0 comments

Public content is the most average content

https://markferraz.com/perspective#essay-009
1•markferraz•6m ago•0 comments

Illegal cars on Market Street surge after Mayor Lurie welcomes Waymo

https://www.sfgate.com/local/article/cars-market-street-22234601.php
2•mikhael•8m ago•0 comments

AI Infrastructure–Not Models–Will Define the Next Decade of AI

https://alltechmagazine.com/ai-infrastructure-crisis-slowing-down-ai-progress/
1•dekhna•8m ago•0 comments

Haskell: Debugging

https://wiki.haskell.org/Debugging
2•tosh•11m ago•0 comments

UK votes to leave EU (2016)

https://www.bbc.com/news/uk-politics-36615028
1•downbad_•13m ago•2 comments

Reverse-Engineering Kasada in an Afternoon

https://www.kernel.sh/blog/detection
1•gguergabo•13m ago•0 comments

Under the canopy of drone nets covering Ukraine's frontlines

https://www.reuters.com/pictures/under-canopy-drone-nets-covering-ukraines-frontlines-2026-04-29
2•bryan0•13m ago•0 comments

The LLM Is Not a Junior Engineer

https://jacobharr.is/personal/llm-not-junior-engineer
3•speckx•14m ago•0 comments

Ekala's Nix Book

https://ekala-project.github.io/nix-book/
2•Meleagris•15m ago•0 comments

Elon Musk Seemingly Admits xAI Has Used OpenAI's Models to Train Its Own

https://www.wired.com/story/elon-musk-distill-openai-models-partly-xai/
3•louiereederson•17m ago•0 comments

Yrkit: A dev environment that runs on your phone – deploy included

https://yrkit.com
1•mwtheus•18m ago•1 comment

Building the compute infrastructure for the Intelligence Age

https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age/
2•aghuang•19m ago•0 comments

Albert: A model-agnostic AI coding CLI with provider fallback

https://github.com/eriirfos-eng/ternary-intelligence-stack/tree/main/agent_albert_cli
1•rfi-irfos•19m ago•0 comments

GitHub Copilot silently inserts itself as a co-author

https://github.com/orgs/community/discussions/194075
5•krikou•20m ago•1 comment

GPT-5.5 is the second model to complete AISI multi-step cyber-attack simulation

https://twitter.com/AISecurityInst/status/2049868227740565890
3•SyneRyder•21m ago•1 comment

Full-Text Search with DuckDB

https://peterdohertys.website/blog-posts/full-text-search-w-duckdb.html
2•ethagnawl•22m ago•0 comments

Benchmarking Local LLM/Harness Combinations

https://neuralnoise.com///2026/harness-bench-wip/
1•pminervini•24m ago•0 comments

Cyborg Evals

https://www.lesswrong.com/posts/zctBgvzxamFThgc3T/cyborg-evals
2•frmsaul•24m ago•1 comment

LinuxOnTab: Real Linux. In a browser tab.

https://linuxontab.com/
1•kilian-ai•25m ago•0 comments

The Evolution of Open Source with Kelsey Hightower [video]

https://www.youtube.com/watch?v=a5-zTLJprpU
2•mooreds•26m ago•0 comments

Anthropic wants to be the AWS of agentic AI

https://thenewstack.io/anthropic-agents-managed-aws-claude/
2•Brajeshwar•26m ago•0 comments

Tess Observations

https://tess.mit.edu/
1•mooreds•27m ago•0 comments

What is Windows K2? Inside Microsoft's big plan to save Windows 11

https://www.windowscentral.com/microsoft/windows-11/what-is-windows-k2-everything-you-need-to-kno...
1•robotnikman•27m ago•2 comments

What Happens in the First 24 Hours After a New Asset Goes Live

https://www.bleepingcomputer.com/news/security/what-happens-in-the-first-24-hours-after-a-new-ass...
1•mooreds•28m ago•0 comments