How does misalignment scale with model intelligence and task complexity?

https://alignment.anthropic.com/2026/hot-mess-of-ai/
75•salkahfi•2h ago

Comments

CuriouslyC•1h ago
This is a good line: "It found that smarter entities are subjectively judged to behave less coherently"

I think this is twofold:

1. Advanced intelligence requires the ability to traverse between domain valleys in the cognitive manifold. Whether via temperature or some fancy tunneling technique, that traversal is going to be higher-error (less coherent) in the valleys of the manifold than naive gradient-following to a local minimum (see the toy sketch after this list).

2. It's hard to "punch up" when evaluating intelligence. When someone is a certain amount smarter than you, distinguishing their plausible bullshit from their deep insights is really, really hard.
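
A toy illustration of the temperature idea (a minimal simulated-annealing sketch on a 1-D double well; the "cognitive manifold" framing is the commenter's metaphor, and the function here is invented):

```python
import math, random

random.seed(0)

def f(x):
    # Double well: two "domain valleys" separated by a ridge at x = 0
    return (x * x - 1) ** 2

x, temp = -1.0, 2.0  # start deep in the left valley, high temperature
for _ in range(5000):
    cand = x + random.gauss(0, 0.1)
    # Metropolis rule: always accept downhill moves; accept uphill moves
    # with probability exp(-delta / temp). High temperature is what lets
    # the walker cross the ridge between valleys, at the cost of noisier
    # (less coherent) trajectories while it does.
    if f(cand) < f(x) or random.random() < math.exp((f(x) - f(cand)) / temp):
        x = cand
    temp = max(0.01, temp * 0.999)  # cool down

print(f"settled near x = {x:+.2f}")  # may end in either valley
```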

xanderlewis•1h ago
What do 'domain valleys' and 'tunneling' mean in this context?
esyir•1h ago
Not the OP, but my interpretation here is that if you model replies as points in a vector space, assuming points from a given domain cluster close to each other, a reply that spans two domains needs to "tunnel" between those two clusters.
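
A toy sketch of that picture (the 2-D "embeddings" and domain labels are invented for illustration): a reply spanning two domains sits in the sparse gap between clusters, far from both centroids.

```python
import numpy as np

# Invented 2-D "embeddings" for replies in two domains
physics = np.array([[0.9, 0.1], [0.8, 0.2], [0.95, 0.05]])
poetry = np.array([[0.1, 0.9], [0.2, 0.8], [0.05, 0.95]])

def dist_to_centroid(reply, cluster):
    # Distance from a reply's embedding to a domain's centroid
    return np.linalg.norm(reply - cluster.mean(axis=0))

cross_domain = np.array([0.5, 0.5])  # a reply spanning both domains

# The cross-domain reply lands in the low-density gap between clusters,
# far from both centroids: the "valley" it has to tunnel through.
print(dist_to_centroid(cross_domain, physics))  # ~0.54
print(dist_to_centroid(cross_domain, poetry))   # ~0.54
```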
energy123•1h ago
Incoherence is not error.

You can have vanishingly small error and incoherence at its max: error is roughly bias, incoherence roughly variance.

Small error would be evidence of perfect alignment (zero bias); only low incoherence on top of that would indicate very low variance.
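
A minimal numpy sketch of that bias/variance reading (the mapping of error to bias and incoherence to variance is this comment's framing, and the numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
target = 1.0  # the goal the agent is "aligned" to

# Agent A: biased but steady (coherent, consistently wrong)
a = rng.normal(loc=1.3, scale=0.01, size=10_000)
# Agent B: unbiased but erratic (aligned on average, incoherent)
b = rng.normal(loc=1.0, scale=2.0, size=10_000)

for name, x in [("coherent but biased", a), ("aligned but incoherent", b)]:
    print(f"{name}: bias={x.mean() - target:+.3f}, variance={x.var():.3f}")
```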

p-e-w•26m ago
> When someone is a certain amount smarter than you, distinguishing their plausible bullshit from their deep insights is really, really hard.

Insights are “deep” not on their own merit, but because they reveal something profound about reality. Such a revelation is either testable or not. If it’s testable, distinguishing it from bullshit is relatively easy, and if it’s not testable even in principle, a good heuristic is to put it in the bullshit category by default.

skydhash•18m ago
The issue is the revelation: it's always individual at some level. And don't forget our senses are crude. The best approach is to store "insights" as information until we collect enough data to test them (hopefully without a lot of bias). But that can be more than a lifetime's work, so sometimes you have to take some insights at face value based on heuristics (parents, teachers, elders, authority, ...).
cyanydeez•1h ago
Oh, the irony of thinking this refers to the investors and shell companies.
gopalv•1h ago
> Making models larger improves overall accuracy but doesn't reliably reduce incoherence on hard problems.

Coherence requires two opposing forces to hold it in one dimension, and at least three in higher-dimensional quality spaces.

My team wrote up a paper titled "If You Want Coherence, Orchestrate a Team of Rivals"[1] because we kept finding that raising the reasoning threshold resulted in less coherence: more experimentation before hitting a dead end and turning around.

So we got better results using Haiku (failing over to Sonnet) rather than Opus, and using a higher-reasoning model to decompose tasks rather than perform each of them.

Once a plan is made, the cheaper models do better because they don't second-guess their approaches: they fail or they succeed. They are not as tenacious as the higher-cost models.

We can escalate to a higher authority and get out of that mess faster if we fail hard and early.

Knowing exactly how a failure happened seems to be less useful to the higher-reasoning model than to the action-biased models.

Splitting the tactical and strategic sides of the problem seems to work, much as generals don't carry rifles in a war (a rough sketch of this split follows below).

[1] - https://arxiv.org/abs/2601.14351
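
What that split might look like in the abstract (a sketch under stated assumptions: the model names, the call_model interface, and the escalation rule are illustrative, not the paper's actual harness):

```python
# Hypothetical planner/executor split. Model names, the call_model
# interface, and the escalation policy are illustrative only.
def run(task, call_model):
    # High-reasoning model decomposes the task but never executes.
    steps = call_model("opus", f"Decompose into steps: {task}")
    results = []
    for step in steps:
        # Cheap, action-biased model first; escalate once on failure.
        for model in ("haiku", "sonnet"):
            out = call_model(model, step)
            if out is not None:  # it fails or it succeeds, no grinding
                results.append(out)
                break
        else:
            # Fail hard and early: hand the mess back up rather than
            # letting a tenacious high-cost model chew on it.
            raise RuntimeError(f"escalate to planner: {step!r} failed")
    return results
```

The design choice mirrors the comment: the planner never executes, the executors never plan, and failure surfaces immediately instead of being absorbed by retries.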

tsunamifury•1h ago
I don't know why it seems so hard for these guys to understand: you scorecard every step of a new strategy by how much it closes the distance to the goal, and if you have multiple generated forward options with no good weight, you spawn new agents down multiple paths. Then you score all the terminal branches and prune. (A sketch of this score-and-prune loop follows below.)

LLMs aren’t constrained to linear logic like your average human.
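
A minimal sketch of that score-and-prune loop (beam-search style; the expand and score functions are placeholders for agent rollouts and a distance-to-goal scorecard):

```python
import heapq

def expand(state):
    # Placeholder for spawning agents on multiple forward options
    return [state + [option] for option in range(3)]

def score(state):
    # Placeholder scorecard: how much this branch closes
    # the distance to the goal (here, a toy objective)
    return sum(state) - len(state)

def search(root, depth=3, beam=4):
    frontier = [root]
    for _ in range(depth):
        children = [c for s in frontier for c in expand(s)]
        # Score all branches, keep the best `beam`, prune the rest
        frontier = heapq.nlargest(beam, children, key=score)
    return max(frontier, key=score)

print(search([]))  # best terminal branch, here [2, 2, 2]
```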

throwpoaster•1h ago
Yudkowsky btfo.
IgorPartola•1h ago
For some reason the article reads to me like “AI is not evil, it just has accidents when it loses coherence.” Sounds a lot like liability shifting.
smy20011•1h ago
I think it's not that the AI is working on "misaligned" goals; the user never specifies the goal clearly enough for the AI system to work from.

However, I think producing a detailed enough specification requires the same or an even larger amount of work than writing the code. We write rough specifications and clarify them during the process of coding. There is a minimum of effort required to produce these specifications, and AI will not help you speed that up.

crabmusket•1h ago
That makes me wonder about the "higher and higher-level language" escalator. When you're writing in assembly, is it more work to write the code than the spec? And is the reverse true if you can code your system up in Ruby? If so, does that imply anything about the "spec-driven" workflow people are using with AIs? Are we right on the cusp, where writing natural-language specs and writing high-level code are comparably productive?
charcircuit•54m ago
If you are on the same wavelength as someone, you don't need to produce a full spec. You can trust that the other person has the same vision as you and will pick reasonable ways to implement things. This is one reason why personalized AI agents are important.
skydhash•26m ago
Programming languages can be a thinking tool for a lot of tasks, much like other notations such as sheet music and map drawing. A condensed, somewhat formal way of describing ideas increases communication speed. It may lack nuance, but in some cases nuance is harmful.

The nice thing about code compared to other notation is that it's useful on its own. You describe an algorithm and the machine can then solve the problem ad infinitum. It's one step instead of two: writing a spec and having an LLM translate it, then having to verify the output and alter it.

Assembly and high-level languages are equivalent in terms of semantics. The latter help in managing complexity by reducing harmful possibilities (manual memory management, off-by-one errors) and providing common patterns (iterators/collections, structs and other data structures, ...) so that whole categories of problems are easily solved. There's no higher model of computation unlocked, just a faster level of productivity from following proven patterns (a small illustration follows below).
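
A small illustration of that point, with Python standing in for the high-level side (the snippet is mine, not the commenter's):

```python
data = [3, 1, 4, 1, 5]

# Index-managed loop: the bounds are the programmer's problem, and
# range(len(data) - 1) would silently drop the last element.
total = 0
for i in range(len(data)):
    total += data[i]

# Iterator form: the off-by-one category of bug cannot be expressed.
assert total == sum(data)
```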

Spec-driven workflows are a mirage, because even the best specs leave a lot of details unspecified. Those details are crucial, as most of programming is making the computer not do the various things it can do.

crabmusket•14m ago
> most of programming is making the computer not do the various things it can do

This is a very stimulating way of putting it!

jmtulloss•1h ago
The comments so far seem focused on taking cheap shots, but as somebody working on using AI to help people with hard, long-term tasks, I find it a valuable piece of writing.

- It's short and to the point

- It's actionable in the short term (make sure the tasks per session aren't too difficult) and useful for researchers in the long term

- It's informative about how these models work, written by some of the best in the business

- It gives us a specific vector to look at, clearly defined ("coherence", or, more fun, "hot mess")

kernc•9m ago
Other actionable insights are:

- Fold amendments back into the initial prompt.

- Evaluate a prompt multiple times and ensemble the results (a minimal sketch follows).
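
A minimal sketch of the ensemble point, assuming a hypothetical call_model function that returns a short answer string; disagreement across runs doubles as an incoherence signal:

```python
from collections import Counter

def ensemble(call_model, prompt, n=5):
    # Evaluate the same prompt n times and majority-vote the answers.
    # call_model is a hypothetical function returning a short string.
    answers = [call_model(prompt) for _ in range(n)]
    best, votes = Counter(answers).most_common(1)[0]
    return best, votes / n  # answer plus agreement rate

# e.g. ensemble(my_model, "Is 91 prime? Answer yes or no.")
# A low agreement rate is itself a warning sign of incoherence.
```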

nayroclade•1h ago
The models they tested are already way behind the current state-of-the-art. Would be interesting to see if their results hold up when repeated with the latest frontier models.


The Codex App

https://openai.com/index/introducing-the-codex-app/
527•meetpateltech•8h ago•357 comments

Anki ownership transferred to AnkiHub

https://forums.ankiweb.net/t/ankis-growing-up/68610
233•trms•5h ago•63 comments

GitHub experiences various partial outages/degradations

https://www.githubstatus.com?todayis=2026-02-02
150•bhouston•5h ago•38 comments

xAI joins SpaceX

https://www.spacex.com/updates#xai-joins-spacex
476•g-mork•4h ago•1067 comments

The Connection Machine CM-1 "Feynman" T-shirt

https://tamikothiel.com/cm/cm-tshirt.html
31•tosh•3d ago•11 comments

Julia

https://borretti.me/fiction/julia
42•ashergill•3h ago•4 comments

Ask HN: Who is hiring? (February 2026)

240•whoishiring•10h ago•301 comments

Hacking Moltbook

https://www.wiz.io/blog/exposed-moltbook-database-reveals-millions-of-api-keys
247•galnagli•10h ago•155 comments

Court orders restart of all US offshore wind power construction

https://arstechnica.com/science/2026/02/court-orders-restart-of-all-us-offshore-wind-construction/
216•ck2•3h ago•96 comments

Joedb, the Journal-Only Embedded Database

https://www.joedb.org/index.html
40•mci•3d ago•6 comments

Firefox Getting New Controls to Turn Off AI Features

https://www.macrumors.com/2026/02/02/firefox-ai-toggle/
76•stalfosknight•2h ago•29 comments

Carnegie Mellon University Computer Club FTP Server

http://128.237.157.9/pub/
12•1vuio0pswjnm7•4d ago•2 comments

4x faster network file sync with rclone (vs rsync) (2025)

https://www.jeffgeerling.com/blog/2025/4x-faster-network-file-sync-rclone-vs-rsync/
263•indigodaddy•3d ago•134 comments

Advancing AI Benchmarking with Game Arena

https://blog.google/innovation-and-ai/models-and-research/google-deepmind/kaggle-game-arena-updates/
110•salkahfi•8h ago•47 comments

Training a trillion parameter model to be funny

https://jokegen.sdan.io/blog
17•sdan•6d ago•11 comments

Nano-vLLM: How a vLLM-style inference engine works

https://neutree.ai/blog/nano-vllm-part-1
217•yz-yu•13h ago•24 comments

The largest number representable in 64 bits

https://tromp.github.io/blog/2026/01/28/largest-number-revised
82•tromp•8h ago•58 comments

Todd C. Miller – Sudo maintainer for over 30 years

https://www.millert.dev/
301•wodniok•9h ago•177 comments

Zig Libc

https://ziglang.org/devlog/2026/#2026-01-31
156•ingve•9h ago•58 comments

Geologists may have solved mystery of Green River's 'uphill' route

https://phys.org/news/2026-01-geologists-mystery-green-river-uphill.html
146•defrost•13h ago•37 comments

Ask HN: Who wants to be hired? (February 2026)

100•whoishiring•10h ago•242 comments

Pretty soon, heat pumps will be able to store and distribute heat as needed

https://www.sintef.no/en/latest-news/2026/pretty-soon-heat-pumps-will-be-able-to-store-and-distri...
146•PaulHoule•1d ago•127 comments

Why software stocks are getting pummelled

https://www.economist.com/business/2026/02/01/why-software-stocks-are-getting-pummelled
145•petethomas•21h ago•204 comments

GitHub discusses giving maintainers control to disable PRs

https://github.com/orgs/community/discussions/185387
19•aofeisheng•2h ago•4 comments

Show HN: Adboost – A browser extension that adds ads to every webpage

https://github.com/surprisetalk/AdBoost
98•surprisetalk•13h ago•109 comments

IsoCoaster – Theme Park Builder

https://iso-coaster.com/
100•duck•3d ago•25 comments

UK government launches fuel forecourt price API

https://www.gov.uk/guidance/access-the-latest-fuel-prices-and-forecourt-data-via-api-or-email
90•Technolithic•13h ago•107 comments

Banning lead in gas worked. The proof is in our hair

https://attheu.utah.edu/health-medicine/banning-lead-in-gas-worked-the-proof-is-in-our-hair/
9•geox•39m ago•1 comment

Nvidia shares are down after report that its OpenAI investment stalled

https://www.cnbc.com/2026/02/02/nvidia-stock-price-openai-funding.html
109•greatgib•6h ago•47 comments