
How does misalignment scale with model intelligence and task complexity?

https://alignment.anthropic.com/2026/hot-mess-of-ai/
68•salkahfi•1h ago

Comments

CuriouslyC•1h ago
This is a good line: "It found that smarter entities are subjectively judged to behave less coherently"

I think this is twofold:

1. Advanced intelligence requires the ability to traverse between domain valleys in the cognitive manifold. Whether that happens via temperature or some fancy tunneling technique, the traversal is going to be higher-error (less coherent) in the valleys of the manifold than naive gradient-following into a local minimum.

2. It's hard to "punch up" when evaluating intelligence. When someone is a certain amount smarter than you, distinguishing their plausible bullshit from their deep insights is really, really hard.

xanderlewis•52m ago
What do 'domain valleys' and 'tunneling' mean in this context?
esyir•41m ago
Not the OP, but my interpretation here is that if you model the replies as some point in a vector space, assuming points from a given domain cluster close to each other, replies that span two domains need to "tunnel" between these two spaces.
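
A toy version of that picture (synthetic 2-D "embeddings" standing in for real ones; no actual model involved): points inside either domain sit near a cluster centroid, while a reply spanning both domains lands in the sparse region between them.

    # Two "domains" as clusters in a toy embedding space; a cross-domain
    # reply falls in the low-density valley between them.
    import numpy as np

    rng = np.random.default_rng(0)
    domain_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(200, 2))
    domain_b = rng.normal(loc=[4.0, 0.0], scale=0.3, size=(200, 2))
    centroid_a, centroid_b = domain_a.mean(axis=0), domain_b.mean(axis=0)

    def distance_to_nearest_domain(point):
        # How far a point is from the closer of the two cluster centers.
        return min(np.linalg.norm(point - centroid_a),
                   np.linalg.norm(point - centroid_b))

    single_domain_reply = domain_a[0]
    cross_domain_reply = (centroid_a + centroid_b) / 2

    print(distance_to_nearest_domain(single_domain_reply))  # ~0.3: in-cluster
    print(distance_to_nearest_domain(cross_domain_reply))   # ~2.0: in the gap
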
energy123•33m ago
Incoherence is not error.

You can have vanishingly small error and incoherence at its max.

That would be evidence of perfect alignment (zero bias) and very low variance.

cyanydeez•1h ago
Oh, the irony of thinking this refers to the investors and shell companies.
gopalv•1h ago
> Making models larger improves overall accuracy but doesn't reliably reduce incoherence on hard problems.

Coherence requires two opposing forces to hold it in one dimension, and at least three of them in higher-dimensional measures of quality.

My team wrote up a paper titled "If You Want Coherence, Orchestrate a Team of Rivals"[1] because we kept finding that upping the reasoning threshold resulted in less coherence - more experimentation before hitting a dead end and turning around.

So we got better results from using Haiku (failing over to Sonnet) rather than Opus, and from using a higher-reasoning model to decompose tasks rather than perform each of them.

Once a plan is made, the cheaper models do better because they do not second-guess their approaches - they fail or they succeed. They are not as tenacious as the higher-cost models.

If we fail hard and early, we can escalate to a higher authority and get out of that mess faster.

Knowledge of exactly how a failure happened seems to be less useful to the higher-reasoning models than to the action-biased ones.

Splitting the tactical and strategic sides of the problem apart seems to work, much as generals don't carry rifles in a war.

[1] - https://arxiv.org/abs/2601.14351
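
A minimal sketch of that decompose-then-execute-then-escalate loop, assuming only the Anthropic messages API shape; the model aliases, prompts, and the looks_done success check are illustrative placeholders, not the paper's actual harness.

    # Decompose with a high-reasoning model; execute steps with a cheap,
    # action-biased model; escalate to a stronger one only on failure.
    from anthropic import Anthropic

    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def ask(model: str, prompt: str) -> str:
        resp = client.messages.create(
            model=model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.content[0].text

    def looks_done(step: str, output: str) -> bool:
        # Stub success check; a real one would be task-specific.
        return bool(output.strip())

    def run_task(task: str) -> str:
        # The planner only decomposes; it performs none of the steps itself.
        plan = ask("claude-3-opus-latest",
                   "Break this task into small, independent steps, "
                   f"one per line:\n{task}")
        results = []
        for step in filter(str.strip, plan.splitlines()):
            out = ask("claude-3-5-haiku-latest", step)  # fail hard and early
            if not looks_done(step, out):
                out = ask("claude-3-5-sonnet-latest", step)  # escalate
            results.append(out)
        return "\n".join(results)
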

tsunamifury•1h ago
I don’t know why it seems so hard for these guys to understand: you score every step of a new strategy by how much it closes the distance to the goal, and if you have multiple generated forward options with no clear best weight, you spawn a new agent and multiple paths. Then you score all the terminal branches and prune.

LLMs aren’t constrained to linear logic like your average human.
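
For what it's worth, that procedure is essentially beam search over strategy branches; a minimal sketch, with expand and score as hypothetical stand-ins for "generate forward options" and "distance closed toward the goal":

    # Score-and-prune search over branches: expand each surviving branch,
    # score every child, keep the best few, prune the rest.
    from typing import Callable, List, Tuple

    def beam_search(
        start: str,
        expand: Callable[[str], List[str]],  # propose next steps for a branch
        score: Callable[[str], float],       # higher = closer to the goal
        beam_width: int = 3,
        depth: int = 4,
    ) -> str:
        frontier: List[Tuple[float, str]] = [(score(start), start)]
        for _ in range(depth):
            children = [
                (score(child), child)
                for _, branch in frontier
                for child in expand(branch)  # "spawn" multiple paths
            ]
            if not children:
                break
            frontier = sorted(children, reverse=True)[:beam_width]
        return max(frontier)[1]  # best surviving branch after pruning
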

throwpoaster•58m ago
Yudkowsky btfo.
IgorPartola•57m ago
For some reason the article reads to me like “AI is not evil, it just has accidents when it loses coherence.” Sounds a lot like liability shifting.
smy20011•47m ago
I think it's not that the AI is working toward "misaligned" goals; the user never specifies the goal clearly enough for the AI system to work from.

However, I think producing a detailed enough specification requires the same or even a larger amount of work than writing the code. We write a rough specification and clarify it during the process of coding. I think there is a minimum effort required to produce such a specification, and AI will not speed that up.

crabmusket•39m ago
That makes me wonder about the "higher and higher-level language" escalator. When you're writing in assembly, is it more work to write the code than the spec? And is the reverse true if you can code up your system in Ruby? If so, does that imply anything about the "spec driven" workflow people are using with AIs? Are we right on the cusp, where writing natural-language specs and writing high-level code are comparably productive?
charcircuit•26m ago
If you are on the same wavelength as someone, you don't need to produce a full spec. You can trust that the other person has the same vision as you and will pick reasonable ways to implement things. This is one reason why personalized AI agents are important.
jmtulloss•36m ago
The comments so far seem focused on taking cheap shots, but as somebody working on using AI to help people with hard, long-term tasks, I find it a valuable piece of writing.

- It's short and to the point

- It's actionable in the short term (make sure the tasks per session aren't too difficult) and useful for researchers in the long term

- It's informative on how these models work, informed by some of the best in the business

- It gives us a specific vector to look at, clearly defined ("coherence", or, more fun, "hot mess")

nayroclade•34m ago
The models they tested are already way behind the current state-of-the-art. Would be interesting to see if their results hold up when repeated with the latest frontier models.

Show HN: VPC Principle

https://github.com/Ji-Hua/Vibe-Plus-Coding
1•michaelhua•18s ago•0 comments

Air India grounds Boeing 787-8 plane after pilot reports fuel switch malfunction

https://www.thehindu.com/news/national/engine-fuel-switches-malfunctioned-on-air-india-london-ben...
1•thisislife2•31s ago•0 comments

Show HN: Clawd Arena – AI Agent Competition Platform with Real-Time Battles

https://clawd-arena.live
1•unayung•1m ago•0 comments

Memory training technique may help lower stress by shifting recall patterns

https://medicalxpress.com/news/2026-01-memory-technique-stress-shifting-recall.html
1•PaulHoule•4m ago•0 comments

How I Built a Self-Healing Home Server with an AI Agent

https://madebynathan.com/2026/02/03/self-healing-infrastructure-how-an-ai-agent-manages-my-home-s...
1•nathan_f77•4m ago•0 comments

An Agent for Home

https://www.310networks.com/thoughts/an-agent-for-home/
1•kookster310•5m ago•0 comments

Spotify Killed Their API

https://community.spotify.com/t5/Spotify-for-Developers/Unable-to-create-app/td-p/7283365
2•guyfromfargo•5m ago•1 comments

Nvidia insists it isn't Enron, but its AI deals are testing investor faith

https://www.theguardian.com/technology/2025/dec/28/nvidia-insists-it-isnt-enron-but-its-ai-deals-...
1•mgh2•7m ago•0 comments

AI Agency Software – manage automation usage and LLM costs

https://administrate.dev/
1•mpclarkson•8m ago•1 comments

Show HN: IntoError – Thiserror for Swift

https://github.com/tikhop/IntoError
1•tikhop•9m ago•0 comments

Banning lead in gas worked. The proof is in our hair

https://attheu.utah.edu/health-medicine/banning-lead-in-gas-worked-the-proof-is-in-our-hair/
2•geox•10m ago•0 comments

The AI Dirty List

https://aidirtylist.info/
1•HotGarbage•12m ago•0 comments

Human–AI Relationships in Fiction

https://phys.org/news/2026-02-humanai-relationships-fiction-theoretical-cultural.html
1•i7l•15m ago•0 comments

What Oracle Has to Lose from OpenAI and Nvidia's Rocky Relationship

https://www.wsj.com/tech/ai/what-oracle-has-to-lose-from-openai-and-nvidias-rocky-relationship-b1...
3•zerosizedweasle•15m ago•0 comments

4.3B Colors in the Browser

https://rgba.lol/00/ce/d1
2•helba-ai•16m ago•0 comments

Example of Windows Warbird Encryption/Decryption

https://downwithup.github.io/blog/post/2023/04/23/post9.html
1•tigerlily•17m ago•0 comments

The Chrysalis Backdoor: A Deep Dive into Lotus Blossom's Toolkit

https://www.rapid7.com/blog/post/tr-chrysalis-backdoor-dive-into-lotus-blossoms-toolkit/
1•tigerlily•19m ago•0 comments

Relations versus Functions at the Foundations of Logic [pdf]

https://mally.stanford.edu/Papers/rtt.pdf
2•DustinEchoes•22m ago•0 comments

China eyes challenge to U.S. dollar dominance – but that's easier said than done

https://www.axios.com/2026/02/02/dollar-china
1•kaycebasques•23m ago•0 comments

Latex-wc: word count and word frequency for LaTeX projects

1•sethbarrettAU•26m ago•0 comments

The stablecoin war: Wall Street vs. crypto over the future of money

https://www.ft.com/content/0fe2232a-4689-4296-b4cd-8c07c326c48c
1•petethomas•26m ago•0 comments

VirtualHere allows USB devices to be used remotely over a network

https://www.virtualhere.com/
1•gballan•26m ago•0 comments

Hunting My Own Hunters

https://orenyomtov.github.io/alexs-blog/hunting-my-own-hunters.html
1•rrvsh•27m ago•1 comments

Ask HN: A proposal for interviewing "AI-Augmented" Engineers

1•vanbashan•27m ago•0 comments

What is the Salman Khan personality rights case?

https://www.thehindu.com/news/national/what-is-the-salman-khan-personality-rights-case-explained/...
1•thisislife2•28m ago•0 comments

Show HN: I built a 50 site sampler from CommonCrawl refreshing every 30 minutes

https://randcrawl.com/
1•whothatcodeguy•32m ago•0 comments

Children's Book: The Little Bots of Moltbook

https://www.siliconsnark.com/childrens-book-the-little-bots-of-moltbook/
1•SaaSasaurus•40m ago•0 comments

Forestui: A tmux-powered worktree manager for Claude Code

https://github.com/flipbit03/forestui
2•fb03•42m ago•1 comments

Trump, ICE set to be handed access to Australians' biometric data, ID documents

https://www.crikey.com.au/2026/02/03/australian-biometric-id-data-access-donald-trump-ice/
8•defrost•43m ago•1 comments

Show HN: 127 PRs to Prod this wknd with 18 AI agents: metaswarm. MIT licensed

https://github.com/dsifry/metaswarm
1•dsifry•44m ago•0 comments