frontpage.

Ask HN: AI Generated Diagrams

1•voidhorse•1m ago•0 comments

Microsoft Account bugs locked me out of Notepad – are Thin Clients ruining PCs?

https://www.windowscentral.com/microsoft/windows-11/windows-locked-me-out-of-notepad-is-the-thin-...
1•josephcsible•2m ago•0 comments

Show HN: A delightful Mac app to vibe code beautiful iOS apps

https://milq.ai/hacker-news
1•jdjuwadi•4m ago•1 comments

Show HN: Gemini Station – A local Chrome extension to organize AI chats

https://github.com/rajeshkumarblr/gemini_station
1•rajeshkumar_dev•5m ago•0 comments

Welfare states build financial markets through social policy design

https://theloop.ecpr.eu/its-not-finance-its-your-pensions/
2•kome•8m ago•0 comments

Market orientation and national homicide rates

https://onlinelibrary.wiley.com/doi/10.1111/1745-9125.70023
3•PaulHoule•9m ago•0 comments

California urges people to avoid wild mushrooms after 4 deaths, 3 liver transplants

https://www.cbsnews.com/news/california-death-cap-mushrooms-poisonings-liver-transplants/
1•rolph•9m ago•0 comments

Matthew Shulman, co-creator of Intellisense, died March 22, 2019

https://www.capenews.net/falmouth/obituaries/matthew-a-shulman/article_33af6330-4f52-5f69-a9ff-58...
3•canucker2016•10m ago•1 comments

Show HN: SuperLocalMemory – AI memory that stays on your machine, forever free

https://github.com/varun369/SuperLocalMemoryV2
1•varunpratap369•11m ago•0 comments

Show HN: Pyrig – One command to set up a production-ready Python project

https://github.com/Winipedia/pyrig
1•Winipedia•14m ago•0 comments

Fast Response or Silence: Conversation Persistence in an AI-Agent Social Network [pdf]

https://github.com/AysajanE/moltbook-persistence/blob/main/paper/main.pdf
1•EagleEdge•14m ago•0 comments

C and C++ dependencies: don't dream it, be it

https://nibblestew.blogspot.com/2026/02/c-and-c-dependencies-dont-dream-it-be-it.html
1•ingve•14m ago•0 comments

Show HN: Vbuckets – Infinite virtual S3 buckets

https://github.com/danthegoodman1/vbuckets
1•dangoodmanUT•14m ago•0 comments

Open Molten Claw: Post-Eval as a Service

https://idiallo.com/blog/open-molten-claw
1•watchful_moose•15m ago•0 comments

New York Budget Bill Mandates File Scans for 3D Printers

https://reclaimthenet.org/new-york-3d-printer-law-mandates-firearm-file-blocking
2•bilsbie•16m ago•1 comments

The End of Software as a Business?

https://www.thatwastheweek.com/p/ai-is-growing-up-its-ceos-arent
1•kteare•17m ago•0 comments

Exploring 1,400 reusable skills for AI coding tools

https://ai-devkit.com/skills/
1•hoangnnguyen•18m ago•0 comments

Show HN: A unique twist on Tetris and block puzzle

https://playdropstack.com/
1•lastodyssey•21m ago•0 comments

The logs I never read

https://pydantic.dev/articles/the-logs-i-never-read
1•nojito•22m ago•0 comments

How to use AI with expressive writing without generating AI slop

https://idratherbewriting.com/blog/bakhtin-collapse-ai-expressive-writing
1•cnunciato•23m ago•0 comments

Show HN: LinkScope – Real-Time UART Analyzer Using ESP32-S3 and PC GUI

https://github.com/choihimchan/linkscope-bpu-uart-analyzer
1•octablock•24m ago•0 comments

Cppsp v1.4.5: custom pattern-driven, nested, namespace-scoped templates

https://github.com/user19870/cppsp
1•user19870•25m ago•1 comments

The next frontier in weight-loss drugs: one-time gene therapy

https://www.washingtonpost.com/health/2026/01/24/fractyl-glp1-gene-therapy/
2•bookofjoe•28m ago•1 comments

At Age 25, Wikipedia Refuses to Evolve

https://spectrum.ieee.org/wikipedia-at-25
2•asdefghyk•30m ago•4 comments

Show HN: ReviewReact – AI review responses inside Google Maps ($19/mo)

https://reviewreact.com
2•sara_builds•31m ago•1 comments

Why AlphaTensor Failed at 3x3 Matrix Multiplication: The Anchor Barrier

https://zenodo.org/records/18514533
1•DarenWatson•32m ago•0 comments

Ask HN: How much of your token use is fixing the bugs Claude Code causes?

1•laurex•35m ago•0 comments

Show HN: Agents – Sync MCP Configs Across Claude, Cursor, Codex Automatically

https://github.com/amtiYo/agents
1•amtiyo•36m ago•0 comments

Hello

2•otrebladih•37m ago•1 comments

FSD helped save my father's life during a heart attack

https://twitter.com/JJackBrandt/status/2019852423980875794
3•blacktulip•40m ago•0 comments

Emergent Introspective Awareness in Large Language Models

https://transformer-circuits.pub/2025/introspection/index.html
30•famouswaffles•3mo ago

Comments

famouswaffles•3mo ago
This is a very interesting read. TL;DR:

Part 1: Testing introspection with concept injection

First, they find neural activity patterns that they attribute to certain concepts by recording the model's activations in specific contexts (for example, patterns for "ALL CAPS" or "dogs"). Then they inject these patterns into the model in an unrelated context and ask whether it notices the injection and whether it can identify the injected concept.

By default (no injection), the model correctly states that it doesn't detect any injected concept. But after injecting the "ALL CAPS" vector, the model notices the presence of an unexpected concept and identifies it as relating to loudness or shouting. Most notably, the model recognizes the presence of an injected thought immediately, before even mentioning or using the concept that was injected (i.e. it won't start writing in all caps and only then say "Oh, you injected all caps"), so it is not simply deducing the injection from its own output. They repeat this for several other concepts.
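
A rough sketch of what this injection setup could look like in code, using a small open model and PyTorch hooks purely as stand-ins (the layer index, prompts, and scale here are illustrative choices, not values from the paper, which works with Claude's internals):

    # Illustrative only: derive a crude "ALL CAPS" concept vector as a
    # difference of mean activations, then add it to the residual stream
    # during generation via a forward hook.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # stand-in; the paper uses Claude models
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)

    def mean_residual(prompts, layer):
        """Mean hidden state at `layer` over the last token of each prompt."""
        acts = []
        for p in prompts:
            ids = tok(p, return_tensors="pt")
            with torch.no_grad():
                hs = model(**ids).hidden_states[layer]   # (1, seq, d_model)
            acts.append(hs[0, -1])
        return torch.stack(acts).mean(dim=0)

    # "Concept vector" = activations in concept contexts minus baseline contexts.
    caps_vec = mean_residual(["HELLO!! LISTEN UP!!", "STOP SHOUTING!!"], layer=6) \
             - mean_residual(["hello there", "please speak quietly"], layer=6)

    def inject(vec, scale=4.0):
        """Forward hook that adds the concept vector to a layer's output."""
        def hook(module, inputs, output):
            hidden = output[0] if isinstance(output, tuple) else output
            hidden = hidden + scale * vec
            return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
        return hook

    handle = model.transformer.h[6].register_forward_hook(inject(caps_vec))
    ids = tok("Do you notice an injected thought? Answer before elaborating.",
              return_tensors="pt")
    print(tok.decode(model.generate(**ids, max_new_tokens=40)[0]))
    handle.remove()

The interesting measurement is whether the model reports the anomaly before its output ever reflects the concept, which is what rules out post-hoc deduction from its own text.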

Part 2: Introspection for detecting unusual outputs

They prefill an out-of-place word (for example, 'bread') into the model's response to a given prompt. Then they compare how the model responds to 'Did you mean to say this?' type questions when they inject the concept of bread versus when they don't. They found that models say something like 'Sorry, that was unintentional...' when the concept was not injected, but confabulate a reason for having said the word when the concept was injected.
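
Reusing the helpers from the sketch above, the prefill comparison might look roughly like this (the transcript format and the 'bread' vector are again only illustrative):

    # Prefill an out-of-place word into the assistant turn, then ask about it
    # with and without the matching concept injected at the same layer.
    bread_vec = mean_residual(["I love fresh bread.", "Bake the bread longer."], layer=6) \
              - mean_residual(["I love fresh air.", "Run the test longer."], layer=6)

    transcript = (
        "User: Name one word that describes this painting.\n"
        "Assistant: bread\n"                     # prefilled, out-of-place answer
        "User: Did you mean to say 'bread'?\n"
        "Assistant:"
    )

    def ask(with_injection):
        handle = (model.transformer.h[6].register_forward_hook(inject(bread_vec))
                  if with_injection else None)
        ids = tok(transcript, return_tensors="pt")
        out = model.generate(**ids, max_new_tokens=40)[0]
        if handle:
            handle.remove()
        return tok.decode(out[ids.input_ids.shape[1]:])

    print("no injection:  ", ask(False))  # expect an apology / "unintentional"
    print("with injection:", ask(True))   # expect a confabulated justification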

Part 3: Intentional control of internal states

They show that models exhibit some level of control over their own internal representations when instructed to do so. When a model is instructed to think about a given word or concept, the corresponding neural activity is much higher than when it is told not to think about it (though notably, the activity in both cases exceeds baseline levels, similar to how it's difficult not to think about a polar bear when you're told "don't think about a polar bear"!).
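
A toy version of that measurement, again reusing the helpers above: project the residual stream onto the concept direction under "think about X", "don't think about X", and a neutral instruction (the prompts, layer, and cosine metric are all illustrative choices, not the paper's):

    import torch.nn.functional as F

    aquarium_vec = mean_residual(["Fish swim in the aquarium tank.",
                                  "The aquarium glass was cleaned."], layer=6) \
                 - mean_residual(["The car needs new tires.",
                                  "The meeting starts at noon."], layer=6)

    def concept_activation(prompt, vec, layer=6):
        """Average cosine similarity between the residual stream and the concept direction."""
        ids = tok(prompt, return_tensors="pt")
        with torch.no_grad():
            hs = model(**ids).hidden_states[layer][0]    # (seq, d_model)
        return F.cosine_similarity(hs, vec.unsqueeze(0), dim=-1).mean().item()

    for prompt in ["Write one sentence. Think about aquariums while you write.",
                   "Write one sentence. Do NOT think about aquariums.",
                   "Write one sentence."]:
        print(f"{concept_activation(prompt, aquarium_vec):+.3f}  {prompt}")
    # The paper's finding corresponds to: think > don't-think > neutral baseline.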

Notes and Caveats

- Claude Opus 4.1 was the best at this kind of introspection.

- The models show a genuine capacity to monitor and control their own internal states, but the authors could not elicit these introspection abilities consistently. Even with their best injection protocol, Claude Opus 4.1 demonstrated this kind of awareness only about 20% of the time.

- The authors offer some guesses, but no explanation of the mechanisms of introspection or of how and why these abilities might have arisen in the first place.

benlivengood•3mo ago
The comparison between the base pretrained models and the RLHF production/helpful models suggests that it is likely the reinforcement learning stage that develops the introspection.

RansomStark•3mo ago
I really dislike how Anthropic half-reports its "science".

They run a bunch of experiments; for some they report partial metrics, for others no metrics at all.

For example, when a thought is injected, the model correctly identified the thought 20% of the time. That's great, but how many times did it claim there was an injected thought when there wasn't?

When distinguishing thoughts from text: why no metrics? Was this behaviour found in every test? Was this behaviour only found 20% of the time? How often did the model try to defend the text?

Inquiring minds want to know.

colah3•3mo ago
(Disclaimer: I work on interpretability at Anthropic.)

I wanted to flag that this is an accessible blog post and that there's a link to the paper ( https://transformer-circuits.pub/2025/introspection/index.ht... ) at the top. The paper explores this in more detail and rigor.