
Airline adds bunk beds for economy travelers but bans snacks, smells and cuddles

https://apnews.com/article/air-new-zealand-economy-bunk-beds-sleep-c2c434f60116f332c0ce96c69d662c3b
1•randycupertino•1m ago•0 comments

Build Agents that never forget

https://twitter.com/akshay_pachaar/status/2043745099792953508
1•gmays•3m ago•0 comments

From Future of Work to Future of Workers: Addressing Asymptomatic AI Harms

https://arxiv.org/abs/2601.21920
1•laurex•4m ago•0 comments

Looking for feedback on a paper about a revision-capable language model [pdf]

https://github.com/Sean-Diab/Reviser/blob/main/main.pdf
1•param-updater•9m ago•1 comments

Under the hood of MDN's new front end

https://developer.mozilla.org/en-US/blog/mdn-front-end-deep-dive/
1•caisah•11m ago•0 comments

OpenAI expands Codex beyond coding with computer use, memory, and plugins

https://www.neowin.net/news/openai-expands-codex-beyond-coding-with-computer-use-memory-and-plugins/
1•Brajeshwar•12m ago•1 comments

"AI Affiliate Campaign Builder – Auto-generates funnels,leads and promos in 60s"

https://3000-ixuoqvbqmnmkcitl7dir1-6ba1a608.us2.manus.computer
1•rooseveltc•12m ago•0 comments

Recall issued for power banks after explosion kills woman

https://www.cpsc.gov/Recalls/2026/Casely-Reannounces-Recall-of-Wireless-Portable-Power-Banks-Due-...
1•labelbabyjunior•12m ago•0 comments

Closed Source Is a Business Decision, Not Security

https://javiergonzalez.io/blog/closed-source-as-a-security-model/
1•javier123454321•15m ago•0 comments

The Patchwright – Cyberpunk Short Film [video]

https://www.youtube.com/watch?v=-Rzl7nUdEs4
2•daureg•15m ago•1 comments

International standard paper sizes: A series

https://en.wikipedia.org/wiki/International_standard_paper_sizes
1•doener•16m ago•0 comments

Anthropic's Nuclear Bomb

https://warontherocks.com/cogs-of-war/anthropics-nuclear-bomb/
2•azanar•17m ago•0 comments

Show HN: PanicLock – Close your MacBook lid disable TouchID –> password unlock

https://github.com/paniclock/paniclock/
1•seanieb•17m ago•0 comments

SETI may have been tuned to the wrong frequencies

https://iopscience.iop.org/article/10.3847/1538-4357/ae3d33
1•johnbarron•18m ago•0 comments

I built an on-premise ERP for wholesale distributors in Delphi

https://asktheledger.com/
1•josephsprei•20m ago•0 comments

Show HN: Clamp – Web analytics your AI agent can read and query

https://clamp.sh
1•sidneyottelohe•21m ago•1 comments

The Future of Testing Is Here

https://testkube.wistia.com/live/events/gigwl708fn
1•evwitmer•22m ago•1 comments

Vectary Canvas: AI-accelerated ideation across 2D, 3D and AR

https://www.vectary.com/waitlist/
3•mkoor•22m ago•0 comments

The Value of a Performance Oracle

https://wingolog.org/archives/2026/04/07/the-value-of-a-performance-oracle
1•abnercoimbre•23m ago•0 comments

The Internet's Most Powerful Archiving Tool Is in Peril

https://www.wired.com/story/the-internets-most-powerful-archiving-tool-is-in-mortal-peril/
4•doener•25m ago•0 comments

Bringing BitNet to ExecuTorch via Vulkan

https://www.collabora.com/news-and-blog/blog/2026/04/17/bringing-bitnet-to-executorch-via-vulkan/
3•losgehts•26m ago•0 comments

European Space Agency, more than 400 job opportunities in 2026

https://www.esa.int/About_Us/Careers_at_ESA/A_stellar_year_for_talent_more_than_400_job_opportuni...
2•johnbarron•27m ago•0 comments

Who will maintain the web when PHP's veterans retire?

https://thenewstack.io/php-web-skills-hiring-age/
2•Brajeshwar•27m ago•1 comments

Long-Tail Knowledge in Large Language Models

https://arxiv.org/abs/2602.16201
1•wslh•29m ago•0 comments

AI's Mainframe Moment

https://www.mjeggleton.com/blog/AIs-mainframe-moment
3•lelanthran•30m ago•0 comments

Where Enterprises Are Adopting AI

https://a16z.com/where-enterprises-are-actually-adopting-ai/
1•wslh•30m ago•0 comments

Apple's Mac Mini Went Viral. Why Can't You Buy One?

https://www.wsj.com/tech/personal-tech/apple-mac-mini-supply-3e7a7509
1•Anon84•31m ago•0 comments

Beyond Demo Day: Sorting and Value Added in Startup Accelerators

https://www.nber.org/papers/w35063
1•john_horton•32m ago•0 comments

Oil prices plunge as Iran says Strait of Hormuz 'open' during ceasefire

https://www.bbc.com/news/articles/ckg045z73z1o
3•geox•32m ago•0 comments

Hyperscalers have already outspent most famous US megaprojects

https://twitter.com/finmoorhouse/status/2044933442236776794
12•nowflux•33m ago•2 comments

Claude Opus 4.7 costs 20–30% more per session

https://www.claudecodecamp.com/p/i-measured-claude-4-7-s-new-tokenizer-here-s-what-it-costs-you
115•aray07•1h ago

Comments

uberman•1h ago
On actual code, I see what you see: a 30% increase in tokens, which is in line with what they claim as well. I personally don't tend to feed technical documentation or random prose into LLMs.

Given that Opus 4.6 and even Sonnet 4.6 are still valid options, for me the question is not "Does 4.7 cost more than claimed?" but "What capabilities does 4.7 give me that 4.6 did not?"

Yesterday 4.6 was a great option and it is too soon for me to tell if 4.7 is a meaningful lift. If it is, then I can evaluate if the increased cost is justified.

pier25•57m ago
haven't people been complaining lately about 4.6 getting worse?
ed_elliott_asc•52m ago
No, we increased our plans
solenoid0937•50m ago
People complain about a lot of things. Claude has been fine:

https://marginlab.ai/trackers/claude-code-historical-perform...

Majromax•36m ago
While that's a nice effort, the inter-run variability is too high to diagnose anything short of catastrophic model degradation. The typical 95% confidence interval runs from 35% to 65% pass rates, a full factor of two performance difference.

Moreover, on the companion codex graphs (https://marginlab.ai/trackers/codex-historical-performance/), you can see a few different GPT model releases marked yet none correspond to a visual break in the series. Either GPT 5.4-xhigh is no more powerful than GPT 5.2, or the benchmarking apparatus is not sensitive enough to detect such changes.
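(A quick sketch of why an interval that wide implies limited sensitivity. The per-run task count below is a guess for illustration, not a number from the tracker.)

```python
import math

def pass_rate_ci(passes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% confidence interval for a benchmark pass rate."""
    p = passes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return (p - half_width, p + half_width)

# With ~42 tasks per run and a 50% pass rate, the interval spans roughly 35%-65%,
# so only a catastrophic regression would stand out from run-to-run noise.
low, high = pass_rate_ci(passes=21, total=42)
print(f"95% CI: {low:.0%} - {high:.0%}")
```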

cbg0•35m ago
That performance monitor is super easy to game if you cache responses to all the SWE bench questions.
addisonj•29m ago
I will be the first to acknowledge that humans are a bad judge of performance and that some of the allegations are likely just hallucinations...

But... Are you really going to completely rely on benchmarks that have time and time again been shown to be gamed as the complete story?

My take: It is pretty clear that the capacity crunch is real and the changes they made to effort are in part to reduce that. It likely changed the experience for users.

grim_io•50m ago
How long will they host 4.6? Maybe longer for enterprise, but if you have a consumer subscription, you won't have a choice for long, if you still have one at all.
nfredericks•43m ago
Opus 4.5 is still available
grim_io•11m ago
Wow, they hosted it for 6 months. Truly LTS territory :)
Jeremy1026•30m ago
I was trying to figure out earlier today how to get 4.6 to run in Claude Code, as part of the output it included "- Still fully supported — not scheduled for retirement until Feb 2027." Full caveat of, I don't know where it came up with this information, but as others have said, 4.5 is still available today and it is now 5, almost 6 months old.
dallen33•50m ago
I'm still using Sonnet 4.6 with no issues.
risyachka•42m ago
How does this solve the issue? 4.6 will be disabled after one or more release like any other legacy model.
gadflyinyoureye•13m ago
Won't the thing that replaces 4.6 come down in token cost?
iknowstuff•43m ago
Interesting, because I already felt like current models spit out too much garbage verbose code that a human would write in a far more terse, beautiful, and grokkable way.
aray07•27m ago
yeah opus 4.7 feels a lot more verbose - i think they changed the system prompt and removed instructions to be terse in its responses
louiereederson•43m ago
LLMs exist on a logarithmic performance/cost frontier. It's not really clear whether Opus 4.5+ represents a level shift of this frontier or just inhabits a place on the curve that delivers higher performance, but at rapidly diminishing returns to inference cost.

To me, it is hard to reject this hypothesis today. The fact that Anthropic is rapidly trying to increase price may betray the fact that their recent lead is at the cost of dramatically higher operating costs. Their gross margins in this past quarter will be an important data point on this.

I think the tendency for graphs of model assessment to display the log of cost/tokens on the x axis (i.e. Artificial Analysis' site) has obscured this dynamic.
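(A toy version of that frontier, with purely illustrative coefficients, shows what "diminishing returns to inference cost" means on a log axis: each equal performance gain costs ten times more to buy.)

```python
import math

# Toy frontier: performance grows with the log of inference cost.
# The coefficients are illustrative, not fitted to any real benchmark.
def frontier_performance(cost_per_task: float) -> float:
    return 40.0 + 10.0 * math.log10(cost_per_task)

# Each equal jump in performance costs 10x more to buy:
gain_cheap = frontier_performance(10) - frontier_performance(1)    # $9 buys 10 points
gain_dear = frontier_performance(100) - frontier_performance(10)   # $90 buys the same 10
print(gain_cheap, gain_dear)
```

On a log-cost x axis these two jumps look identical, which is the dynamic the comment says such graphs obscure.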

snek_case•14m ago
They're also getting closer to IPO and have a growing user base. They can't justify losing many billions of other people's money in their IPO prospectus.

So there's a push for them to increase revenue per user, which brings us closer to the real cost of running these models.

giwook•3m ago
I agree, and I'm also quite skeptical that Anthropic will be able to remain true to its initial, noble mission statement of acting for the global good once they IPO.

At that point you are beholden to your shareholders and no longer can eschew profit in favor of ethics.

Unfortunately, I think this is the beginning of the end of Anthropic and Amodei being a company and CEO you could actually get behind and believe that they were trying to do "the right thing".

It will become an increasingly more cutthroat competition between Anthropic and OpenAI (and perhaps Google eventually if they can close the gap between their frontier models and Claude/GPT) to win market share and revenue.

Perhaps Amodei will eventually leave Anthropic too and start yet another AI startup because of Anthropic's seemingly inevitable prioritization of profit over safety.

louiereederson•12m ago
I meant to reference Toby Ord's work here. I think his framing of the performance/cost frontier hasn't gotten enough attention: https://www.tobyord.com/writing/hourly-costs-for-ai-agents
paulddraper•5m ago
> The fact that Anthropic is rapidly trying to increase price may betray the fact that their recent lead is at the cost of dramatically higher operating costs.

Or they are just not willing to burn obscene levels of capital like OpenAI.

xd1936•41m ago
And what about with Caveman[1]?

1. https://github.com/juliusbrussee/caveman

Majromax•34m ago
Caveman doesn't and cannot change the tokenizer, so the relative token count differences by input category will remain unchanged.
brokencode•27m ago
Can we have one thread about Claude without people trying to shovel Caveman?

Much of the token usage is in reasoning, exploring, and code generation rather than outputs to the user.

Does making Claude sound like a caveman actually move the needle on costs? I am not sure anymore whether people are serious about this.

To me, caveman sounds bad and is not as easy to understand compared to normal English.

aray07•20m ago
isn’t caveman a joke? why would you use it for real work?
atonse•36m ago
Just yesterday I was happy to have gotten my weekly limit reset [1]. And although I've been doing a lot of mockup work (so a lot of HTML getting written), I think the 1M token stuff is absolutely eating up tokens like CRAZY.

I'm already at 27% of my weekly limit in ONE DAY.

https://news.ycombinator.com/item?id=47799256

aray07•28m ago
yeah similar for me - it uses a bunch more tokens and I haven’t been able to tell the ROI in terms of better instruction following

it seems to hallucinate a bit more (anecdotal)

titaniumtown•9m ago
I had it hallucinate a tool that didn't exist, it was very frustrating!
jabart•21m ago
I'm seeing the opposite. With Opus 4.7 and xhigh, I'm seeing less session usage, it's moving faster, and my weekly usage is not moving that much on a Team Pro account.
jmward01•32m ago
Yeah. I just did a day with 4.7 and I won't be going back for a while. It is just too expensive. On top of the tokenization the thinking seems like it is eating a lot more too.
aray07•29m ago
yeah i am still not clear why there are 5 effort modes now on top of more expensive tokenization
rafram•29m ago
Pretty funny that this article was clearly written by Claude.
markrogersjr•29m ago
4.7 one-shot rate is at least 20-30% higher for me
bcjdjsndon•28m ago
Because those brainiacs added 20-30% more system prompt
CodingJeebus•27m ago
The fundamental problem with these frontier model companies is that they're incentivized to create models that burn through more tokens, full stop. It's a tale as old as capitalism: you wake up every day and choose to deliver more value to your customers or your shareholders; you cannot do both simultaneously forever.

People love to throw around "this is the dumbest AI will ever be", but the corollary to that is "this is the most aligned the incentives between model providers and customers will ever be" because we're all just burning VC money for now.

NickC25•13m ago
> but the corollary to that is "this is the most aligned the incentives between model providers and customers will ever be" because we're all just burning VC money for now.

Please say this louder for everyone to hear. We are still at the stage where it is best for Anthropic's product to be as consumer-aligned (and cost-friendly) as possible. Anthropic is losing a lot of money. Neither of those things will be true in the near future.

stefan_•23m ago
I don't know anything about tokens. Anthropic says Pro has "more usage*", Max has 5x or 20x "more usage*" than Pro. The link to "usage limits" says "determines how many messages you can send". Clearly no one is getting billed for tokens.
aray07•4m ago
anthropic’s pricing is all based on token usage

https://platform.claude.com/docs/en/about-claude/pricing

So if you are generating more tokens, you are eating up your usage faster
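(The arithmetic can be sketched with hypothetical per-million-token prices; real prices vary by model and are on the pricing page linked above.)

```python
# Hypothetical per-million-token prices, for illustration only.
INPUT_PER_MTOK = 15.00   # USD, assumed
OUTPUT_PER_MTOK = 75.00  # USD, assumed

def session_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1e6) * INPUT_PER_MTOK + (output_tokens / 1e6) * OUTPUT_PER_MTOK

base = session_cost(200_000, 50_000)
# A tokenizer that encodes the same text as ~25% more tokens makes the
# same session cost ~25% more, whether billed per token or counted
# against a plan's token budget:
inflated = session_cost(250_000, 62_500)
print(f"${base:.2f} -> ${inflated:.2f} (+{inflated / base - 1:.0%})")
```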

_pdp_•22m ago
IMHO there is a point where incremental model quality will hit diminishing returns.

It is like comparing an 8K display to a 16K display: at normal viewing distance the difference is imperceptible, but 16K comes at a significant premium.

The same applies to intelligence. Sure, some users might register a meaningful bump, but if 99% can't tell the difference in their day-to-day work, does it matter?

A 20-30% cost increase needs to deliver a proportional leap in perceivable value.

snek_case•17m ago
It probably depends what you're using the models for. If you use them for web search, summarizing web pages, I can imagine there's a plateau and we're probably already hitting it.

For coding though, there is kind of no limit to the complexity of software. The more invariants and potential interactions the model can be aware of, the better presumably. It can handle larger codebases. Probably past the point where humans could work on said codebases unassisted (which brings other potential problems).

aray07•6m ago
yeah that's my biggest issue - i'm okay with paying 20-30% more but what is the ROI? i don't see an equivalent improvement in performance. Anthropic hasn't published any data around what these improvements are - just some vague "better instruction following"
mikert89•15m ago
The compute is expensive, what is with this outrage? People just want free tools forever?
rvz•9m ago
> The compute is expensive, what is with this outrage?

Gamblers (vibe-coders) at Anthropic's casino realising that their new slot machine upgrade (Claude Opus) is now taking 20%-30% more credits for every push of the spin button.

Problem is, it advertises how good it is (unverified benchmarks) and has a better random number generator but it still can be rigged (made dumber) by the vendor (Anthropic).

The house (Anthropic) always wins.

> People just want free tools forever?

Using local models is the answer if you want to use AI models for free forever.

aray07•6m ago
are you okay with paying more for your services without any perceived improvement in the service itself?
sipsi•15m ago
I tried to do my usual test (similar to pelican but a bit more complex), but it ran out of the 5-hour limit in 5 minutes. Then after 5 hours I said "go on" and the results were the worst I've ever seen.
qq66•8m ago
This is the backdoor way of raising prices... just inflate the token pricing. It's like ice cream companies shrinking the box instead of raising the price
Yukonv•5m ago
Some broad assumptions are being made that plans give you a precise equivalent to API cost. This is not the case: reverse engineering of plan usage shows cached input is free [0]. If you re-run the math removing cached input, the usage cost is ~5-34% more. Was the token plan budget increase [1] proportional, to account for this? Can't say with certainty. For those paying API costs, though, the price hike is real.

[0] https://she-llac.com/claude-limits

[1] https://xcancel.com/bcherny/status/2044839936235553167
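(A sketch of why which token categories get billed changes the measured increase. It assumes cached input is free under plan accounting, as [0] suggests; all token counts and inflation factors below are hypothetical.)

```python
# All token counts and per-category inflation factors are hypothetical.
def usage_increase(categories):
    """categories: list of (tokens, inflation_factor, billed)."""
    before = sum(t for t, f, billed in categories if billed)
    after = sum(t * f for t, f, billed in categories if billed)
    return after / before - 1

session = [
    (400_000, 1.10, False),  # cached input: free under plan accounting
    (100_000, 1.30, True),   # fresh input, code-heavy
    (50_000,  1.30, True),   # output
]
plan_increase = usage_increase(session)                               # cached tokens drop out
api_increase = usage_increase([(t, f, True) for t, f, _ in session])  # everything billed
print(f"plan: {plan_increase:.0%}, API: {api_increase:.0%}")
```

The same session shows a different percentage increase depending on whether cached input counts, which is why plan and API numbers diverge.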

encoderer•4m ago
In my “repo os” we have an adversarial agent harness running gpt5.4 for plan and implementation and opus4.6 for review. This was the clear winner in the bake-off when 5.4 came out a couple months ago.

Re-ran the bake-off with 4.7 authoring and… gpt5.4 still clearly winning. Same skills, same prompts, same agents.md.

lacoolj•3m ago
This is probably an adjacent result of this (from anthropic launch post):

> In Claude Code, we’ve raised the default effort level to xhigh for all plans.

Try changing your effort level and see what results you get

curioussquirrel•3m ago
Claude's tokenizers have actually been getting less efficient over the years (I think we're at the third iteration at the least since Sonnet 3.5). And if you prompt the LLM in a language other than English, or if your users prompt it or generate content in other languages, the costs go higher even more. And I mean hundreds of percent more for languages with complex scripts like Tamil or Japanese. If you're interested in the research we did comparing tokenizers of several SOTA models in multiple languages, just hit me up.