Ask HN: What Is Anthropic Doing?

9•d3ckard•1h ago
Opus 4.7 has left me without words. This is possibly the biggest quality regression in a computer product since Windows Vista. Or maybe ever.

So, here is my question to the community: what is the play here?

I don't buy the money angle, since they're about to be bleeding subscribers. I can hardly justify paying for Max anymore, and I switched to Codex at work for the first time in months after spending 4 very frustrating hours trying to get the model to fix the code to my liking.

I also don't buy the corporate angle. Models are part of the infrastructure, like APIs, and the same rules apply to them. Mostly: don't break the integrations. Since a new generation of models can completely break output expectations, deploying them becomes an organizational effort, and support for older ones is abysmal. At some point those organizations will get tired of "adapting" all the time.

So what's the play here?

Comments

blinkbat•1h ago
Floundering

metadat•1h ago
They don't have enough compute capacity relative to growth rate. The new tokenizer in Opus 4.7 hints at a coming foundational / architectural change. I expect the next point release will deliver decent results more efficiently.

YMMV. I've been using GPT Codex 5.3, and now 5.4, for the past few months; it works great and is reliable.

troglodytetrain•1h ago
Anthropic definitely appears to be heavily hamstringing LLM response quality as a form of rate limiting.

I've built my own custom coding harness at my slow corp job, since for some reason they give us unlimited Anthropic tokens here, but only when used from their bespoke 'chatgpt' derivative website. However, because they made the questionable design decision to issue all backend API calls from client-side JavaScript, the backend API is exposed directly, so it has been possible to use these unlimited tokens through my 'openclaw we have at home'. It's been a fun project.

But in the last few days I've watched live, several times, as the tool-use agent suddenly, on a given turn, completely forgets the correct tool call tags clearly defined in its system prompt and hallucinates a completely new tool call format that I have never seen before, then weirdly fixes itself some minutes and some turns later. This has literally never been a problem before, across what is at this point hundreds of hours of dev time and thousands of euros of token spend.

That is in addition to (a) new refusals from the agent for the same prompts that worked fine before, and (b) a large number of Cloudflare forbidden responses at specific times of day.

As I am currently in the EU, I've noticed it happening in the late evening for me, around 8pm (12-3pm CST), which I presume is peak usage time in the US.
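
For anyone wanting to check this kind of drift in their own harness, here is a minimal sketch that labels each model turn and tallies bad tool calls by UTC hour. The `<tool_call>{json}</tool_call>` format, the function names, and the "suspect tag" heuristic are all assumptions made for illustration; the actual tags in the harness described above are not given, so substitute whatever your system prompt defines.

```python
import json
import re
from collections import Counter
from datetime import datetime, timezone

# Hypothetical expected format: <tool_call>{"name": ..., ...}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
# Loose heuristic: any tag whose name mentions tool/function/invoke.
SUSPECT_RE = re.compile(
    r"</?[a-z_]*(?:tool|function|invoke)[a-z_]*\b", re.IGNORECASE
)


def classify_turn(text: str) -> str:
    """Label one model turn: 'ok', 'malformed', 'unexpected_format', or 'none'."""
    bodies = TOOL_CALL_RE.findall(text)
    if bodies:
        for body in bodies:
            try:
                call = json.loads(body)
            except json.JSONDecodeError:
                return "malformed"
            if not isinstance(call, dict) or "name" not in call:
                return "malformed"
        return "ok"
    # No well-formed call found; check for something that looks like
    # a tool call written in a different (possibly hallucinated) format.
    if SUSPECT_RE.search(text):
        return "unexpected_format"
    return "none"


def failures_by_utc_hour(turns):
    """Count bad tool-call turns per UTC hour.

    `turns` is an iterable of (timezone-aware datetime, model output) pairs.
    """
    counts = Counter()
    for when, text in turns:
        if classify_turn(text) in ("malformed", "unexpected_format"):
            counts[when.astimezone(timezone.utc).hour] += 1
    return counts
```

A spike in a few specific hours would support the peak-load theory; a flat distribution would point more toward prompt or model changes. The heuristic will produce false positives if the model merely quotes such tags in prose.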

federicchauvat•53m ago
Interesting — I hadn't tracked the hours on my side. A small community tool to collect this would help. The hard part is separating "the model got nerfed" from "my prompts don't fit the new behavior anymore". Think downdetector for LLMs, but based on real metrics instead of user reports. Opt-in client wrapper, anonymized telemetry, public dashboard. Does it exist already? I just searched and couldn't find anything.
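
As a rough illustration of the opt-in client wrapper idea, the sketch below wraps any `call_model(prompt) -> str` function and records only coarse, non-content metrics. Every name here (`TelemetryWrapper`, `TelemetryEvent`, the fields, the salt) is hypothetical, no existing tool is implied, and uploading to a dashboard is deliberately left out.

```python
import hashlib
import time
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass
class TelemetryEvent:
    client: str                # salted hash, never the raw identifier
    model: str
    utc_hour: int
    latency_ms: float
    ok: bool
    error_type: Optional[str]  # exception class name only, no message
    output_chars: int


class TelemetryWrapper:
    """Wraps a model call; records metrics only when explicitly enabled."""

    def __init__(self, call_model: Callable[[str], str], model: str,
                 client_id: str, salt: str, enabled: bool = False):
        self._call_model = call_model
        self.model = model
        self.enabled = enabled  # opt-in: off by default
        digest = hashlib.sha256((salt + client_id).encode("utf-8"))
        self._client = digest.hexdigest()[:16]
        self.events: List[TelemetryEvent] = []

    def __call__(self, prompt: str) -> str:
        start = time.perf_counter()
        error_type = None
        output = ""
        try:
            output = self._call_model(prompt)
            return output
        except Exception as exc:
            error_type = type(exc).__name__
            raise
        finally:
            if self.enabled:
                # Neither the prompt nor the output text is stored.
                self.events.append(TelemetryEvent(
                    client=self._client,
                    model=self.model,
                    utc_hour=datetime.now(timezone.utc).hour,
                    latency_ms=(time.perf_counter() - start) * 1000.0,
                    ok=error_type is None,
                    error_type=error_type,
                    output_chars=len(output),
                ))
```

Metrics like these would show outages and latency shifts by hour, but they would not by themselves separate "the model got nerfed" from "my prompts don't fit the new behavior"; that would likely need a fixed, shared set of probe prompts run on a schedule.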

ai-tamer•1h ago
Same. The numbers match your feel. Going from 4.6 to 4.7: +14.6 on MCP-Atlas, +10.9 on SWE-bench Pro, tool errors cut by two-thirds. But BrowseComp dropped 4.7 points. Anthropic's own announcement says 4.7 "takes the instructions literally" where 4.6 interpreted them loosely, and recommends re-tuning prompts accordingly. In a conversational loop with an opinionated developer, that translates to lower quality because there is less reasoning: the model executes instead of thinking things through. https://llm-stats.com/blog/research/claude-opus-4-7-vs-opus-... https://www.anthropic.com/news/claude-opus-4-7
