DeepSeek V4–almost on the frontier, a fraction of the price

https://simonwillison.net/2026/Apr/24/deepseek-v4/

105•indigodaddy•17h ago

Comments

jdasdf•10h ago

I've been using v4 pro for the past few days and honestly, in terms of quality, it seems more or less on par with OpenAI's 5.4 or Opus 4.6 (I haven't tried 4.7).

To be clear, I'm not doing state of the art stuff. I mostly used it for frontend development since I'm not great at that and just need a decent looking prototype.

But for my purposes it's a perfectly good model, and the price is decent.

I can't wait for an open model small enough for me to run locally to come out though. I hate having to rely on someone else's machines (and getting all my data exfiltrated that way).

enochthered•9h ago

Thanks for sharing your experience, I’m looking to try it out.

Which provider are you using for inference? Opencode or the DeepSeek API?

teruakohatu•10h ago

The pelican is really getting old as a standalone evaluation metric. By now pelicans are certainly going to be in the training set, if not explicitly tuned for, given the press on HN alone.

Keep the pelican but isn’t it time to add something else more novel that all current and past models struggle with?

justinclift•9h ago

Relevant: https://news.ycombinator.com/item?id=47839493

caseyf7•4h ago

It also seems like all of the models have converged on very similar images.

KronisLV•9h ago

I'm currently paying for Anthropic's Max subscription (the 100 USD one) and I quite often hit or approach the 5 hour limits, but usually get to around 60-80% of the weekly limits before they reset (Opus 4.7 with high thinking for everything, unless CC decides to spawn sub-agents with Haiku or something).

Those tokens are heavily subsidized, but DeepSeek's API pricing is looking really good. For example, with an agentic coding setup (roughly 85% input, 15% output and around 90% cache reads) I'd get around 150M tokens per month for the same 100 USD. Even at more output tokens and worse cache performance, it'd still most likely be upwards of 100M.
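
The estimate above can be reproduced with a short blended-price calculation. This is only a sketch: the function name `tokens_for_budget` and the three per-million-token prices in the example call are hypothetical placeholders, not DeepSeek's actual rates. Only the 85/15 input/output split and the roughly 90% cache-read rate come from the comment.

```python
def tokens_for_budget(budget_usd, price_in_miss, price_in_hit, price_out,
                      input_share=0.85, cache_hit_rate=0.90):
    """Return how many total tokens a budget buys at a blended price.

    All prices are in USD per million tokens.
    input_share    -- fraction of all tokens that are input tokens
    cache_hit_rate -- fraction of input tokens billed as cache reads
    """
    output_share = 1.0 - input_share
    # Weighted average price per million tokens across the three billing classes.
    blended = (
        input_share * cache_hit_rate * price_in_hit
        + input_share * (1.0 - cache_hit_rate) * price_in_miss
        + output_share * price_out
    )
    return budget_usd / blended * 1_000_000


if __name__ == "__main__":
    # Placeholder prices (assumptions): substitute the provider's real rates.
    tokens = tokens_for_budget(100, price_in_miss=1.0, price_in_hit=0.1, price_out=4.0)
    print(f"{tokens / 1e6:.0f}M tokens for 100 USD")
```

Lowering `cache_hit_rate` or `input_share` shows how quickly the token count falls when caching is worse or the workload is more output-heavy, which is the sensitivity the comment alludes to.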

try-working•7h ago

Someone on Twitter got >200M tokens for around $10 at the current pricing level

rvz•5h ago

So it begins.

aitchnyu•1h ago

What would be the non-subsidized price for a V4 API? Can it be priced 3x cheaper than bigger models? On OpenRouter, this 1600B-param model costs $0.4, whereas Kimi 2.6 (1000B params) is $0.7 and GLM 5.1 (754B params) is $1.0.

alasano•8h ago

I tweeted about some implementation and review runs that used V4 Pro.

Even without the currently discounted pricing, the value is incredible.

It takes about twice as long to finish code reviews given an identical context compared to Opus 4.7/GPT 5.5, but at 1/10 the cost or less, there's just no comparison.

https://twitter.com/aljosa/status/2049176528638902555

wg0•8h ago

Deepseek v4 Pro feels like Claude Opus 4.6 in its personality, but here's what I found out about costs:

I cut Deepseek v4 loose on a decent-sized TypeScript codebase and asked it to focus only on a single endpoint, go in depth on it layer by layer (API, DTOs, service, database models), form a complete picture of the types involved and introduced, and ensure no ad hoc types are being introduced.

It developed a very brief but very to the point summary of types being introduced and which of them were refunded etc.

Then I asked it to simplify it all.

It obviously went through lots of files in both prompts but total cost? Just $0.09 for the Pro version.

On Claude Opus I think (from past experience before price hikes) these two prompts alone would have easily burned somewhere between $9 and $13, with not much benefit.

Note - I didn't use OpenRouter but rather used the Deepseek API directly, because OpenRouter itself was being rate limited by Deepseek.

ithkuil•2h ago

Even taking into account the fact that they are billing at a 75% discount, it's still quite a bit cheaper.

amelius•57m ago

Aren't they all billing at a discount?

stavros•30m ago

Anthropic's and OpenAI's costs seem to include a fairly ok margin, from the very fourth hand info I have.

baldai•1h ago

The only similarity it has to Opus 4.6 is the 4 in the name. I do not understand these dishonest comparisons. OSS models are cool, cheap and promising for the future -- but why are we pretending they are better than they are?

gmerc•38m ago

Speak for yourself. I found switching from Opus 4.7 to be completely painless and in fact, due to the reliability of Anthropic’s API, less friction despite slower response times. Zero issues on a large monorepo.

myaccountonhn•4h ago

I recently switched from Claude to Opencode Go + pi.dev. It has Deepseek v4 pro along with Kimi K2.6, and it's performing quite well for basic coding, without hitting any limits.

deaux•3h ago

I'm surprised that people here don't care at all about these models openly training on your data, especially if you use them straight from the model developer. Whereas things like "GitHub now automatically opts everyone into using their code for model training" get hundreds of justifiably angry comments, I never see this brought up anymore on posts like these talking about using Chinese models through OpenRouter. This might be explained by "well they're different people", but the difference is too stark for that to be the whole explanation.

pheggs•3h ago

I am personally okay helping them as long as they publish the models and don't keep them closed. And I don't trust the settings where providers say they won't train on it.

prism56•2h ago

If the data is open source on GitHub, then in my opinion it should be fair game.

ozgrakkurt•1h ago

IMO this is unfair for GPL or similarly licensed code.

Seems ok for MIT like licensed code though

ForHackernews•51m ago

It's totally fair to use GPL code, it just means all the models built by Anthropic, OpenAI, etc. using GPL-licensed source are themselves bound by the GPL. Plus, any works created downstream using those AI tools.

We're on the verge of a golden age of software as soon as someone finds a court with courage.

duskdozer•36m ago

Ah, you have much more faith in the legal system than I do. It's nice to dream, though.

notrealyme123•1h ago

Things being public should not be enough. Just because someone leaked your medical information to the public via a data breach, that should not make it fair game. There should be some rules.

prism56•43m ago

I feel that's a false dichotomy. The code on github is freely available for people to read and learn from, leaked medical data isn't.

antiloper•2h ago

AWS Bedrock has DeepSeek models running on their infrastructure. That should be enough to prevent training on user data (there's a markup compared to DeepSeek's pricing though).

And unfortunately AWS doesn't have prepaid billing, so you can't just give the internet access to your API key without getting FinDDoS'd.

deaux•2h ago

The latest one available for serverless inference looks to be from 8 months ago (Deepseek v3.1), which is an eternity and far behind.

gmerc•1h ago

Because they give it away for free and offer APIs at very acceptable rates. Not that hard to figure out, Robin Hood stealing our data tax back comes to mind.

deaux•1h ago

GitHub is free.

notrealyme123•1h ago

User publishes to GitHub => Copilot trains with GitHub data => MS sells Copilot => User works for Microsoft (in the sense of giving their labour for MS to make money)

User publishes to GitHub => Deepseek trains with GitHub data => Deepseek gives model away for free => User did not work for Deepseek (in the sense of giving their labour for Deepseek to make money)

arikrahman•43m ago

Exactly, it's intuitively different.

raincole•1h ago

Two factors. First is anti-americanism (or at least anti-american-capitalism).

But the more important one is the social contract. GitHub came long before the LLM era. The branding around it is being the storage of open source projects, and many users want it to stay away from AI hype. You won't expect LLM providers to stay away from AI hype (duh), so it's less of an issue for them.

duskdozer•40m ago

What do you mean specifically? Data passed through OpenRouter? Or that they too indiscriminately ingest data all over the web? If the former, I assume it's just that anyone still using them just doesn't care where the data comes from. If the latter, well, it seems like every day there's some news on some new model from somewhere, and it takes dedication to complain every time. There's also the factor that I believe DeepSeek is more open with the model, while others keep it entirely proprietary, which feels fairer and (personally) is also less offensive.

dbeley•37m ago

The cool thing about open-weights models is that you are free to use alternative providers that won't phone home to the original model creators.

I see 6 alternative providers listed on Openrouter for DeepSeek V4 Pro for example.

holysantamaria•3h ago

From the pricing page of deepseek:

(3) The deepseek-v4-pro model is currently offered at a 75% discount, extended until 2026/05/31 15:59 UTC.

Was this taken into account when reviewing the model?

cyber_kinetist•2h ago

Yeah, even the Chinese open models have the problem that inference costs for these aren't that cheap. The only way out of an AI bubble collapse is simply more efficient hardware, with lower costs and less infrastructure setup downtime.

gmerc•1h ago

It’s just an introduction price to speed up adoption for the rest of the month, hardly worth mentioning compared to subsidized coding plans.

We know DS runs profitably, and they also indicate in their paper that they expect prices to drop as they get access to the next gen Huawei cards.

gmerc•1h ago

Obviously everyone subsidizes for user acquisition - after all, people need to be coaxed to test your model; Claude Code subscriptions come to mind.

DeepSeek Pro is 65/86% cheaper (i/o tokens) in unsubsidized Pro vs Pro, and 91/97% cheaper with current subsidies.

Flash vs Sonnet 4.6 is 95/98%
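
The two pairs of Pro figures above are consistent with the 75% discount quoted from DeepSeek's pricing page elsewhere in the thread: a discount multiplies whatever price is left, so the savings compound rather than add. A minimal sketch of that arithmetic follows. The function name `saving_after_discount` is made up for illustration, and it assumes the 65/86% figures are the savings before the discount is applied.

```python
def saving_after_discount(base_saving_pct, discount_pct):
    """Combine an existing price advantage with a further discount.

    base_saving_pct -- how much cheaper the model already is, in percent
    discount_pct    -- extra discount applied to the already-cheaper price
    Returns the total saving versus the reference price, in percent.
    """
    # Fraction of the reference price still paid after both reductions.
    remaining = (1 - base_saving_pct / 100) * (1 - discount_pct / 100)
    return (1 - remaining) * 100


if __name__ == "__main__":
    # 65% (input) and 86% (output) are taken from the comment; 75% is the
    # discount quoted from the pricing page.
    for label, base in (("input", 65), ("output", 86)):
        print(f"{label}: {saving_after_discount(base, 75):.2f}% cheaper")
```

Under that assumption the results come out at about 91.25% and 96.5%, which lines up with the 91/97% quoted once rounding of the underlying prices is allowed for.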

taffydavid•2h ago

I tried Deepseek v4 through Opencode at the weekend. I'm a daily Claude/Claude Code user.

I tried to build something simple, and while it got the job done, the thinking displayed did not fill me with confidence. It was pages and pages of "actually no", "hang on", "wait that makes no sense". It was like the model was having a breakdown.

Bear in mind Opencode was also new to me, so I could just be seeing thinking where I usually don't.

atoav•2h ago

> Bear in mind open code was also new to me so I could be just seeing thinking where I usually don't

Well there's your problem.

Edit: I remember seeing similar things with ChatGPT or Codex, although I can't remember in which context.

Jtarii•1h ago

I see similar things using GLM 5.1 in pi.

I had to turn off thinking traces because it was just giving me anxiety looking at it.

raincole•57m ago

The V3/R1 time and now are in such contrast. V3/R1 were hyped hard and barely usable for coding. V4 is much less hyped but (anecdotally) it has completely demolished all the Flash/Lite/Spark models.

zozbot234•46m ago

Huh? R1 was one of the earliest openly available MoE and reasoning models, that's definitely not "hype". People tried to do reasoning before by asking the model to "think it through step by step" but that was a hack. The later V3.1 and V3.2 releases AIUI unified reasoning/non-reasoning use under a single model.

chaosprint•52m ago

I doubt if those models already knew this pelican test...
