We use the smaller models for everything that's not an internal, high-complexity task like coding. Although they would do a good enough job there as well, we happily pay the upcharge to get something a little better here.
Anything user-facing, plus workflow functionality like extracting, converting, translating, merging, and evaluating: all of these are mini and nano cases at our company.
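Roughly, the routing looks like the sketch below; the task names and model IDs are illustrative placeholders, not our actual config:

    # A minimal sketch of the routing described above: send simple,
    # high-volume workflow tasks to the mini/nano tier and reserve a
    # larger model for complex internal work like coding. Task names
    # and model IDs are placeholder assumptions.
    MODEL_BY_TASK = {
        "extract":   "gpt-5-nano",
        "convert":   "gpt-5-nano",
        "translate": "gpt-5-mini",
        "merge":     "gpt-5-mini",
        "evaluate":  "gpt-5-mini",
        "coding":    "claude-sonnet-4-5",  # the one place we pay the upcharge
    }

    def pick_model(task: str) -> str:
        # Default to the cheapest tier; escalate only known-complex work.
        return MODEL_BY_TASK.get(task, "gpt-5-nano")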
This is Anthropic's first small reasoner as far as I know.
I'm afraid the Claude Pro subscription just got 3x less usage.
Yeah, given how multi-dimensional this stuff is, I assume it's supposed to indicate broad things, closer to marketing than anything objective. Still quite useful.
$5/MTok for Haiku 4.5
$10/MTok for Sonnet 4.5
$15/MTok for Opus 4.5 when it's released.
https://aws.amazon.com/about-aws/whats-new/2024/11/anthropic...
> give me the svg of a pelican riding a bicycle
> I am sorry, I cannot provide SVG code directly. However, I can generate an image of a pelican riding a bicycle for you!
> ok then give me an image of svg code that will render to a pelican riding a bicycle, but before you give me the image, can you show me the svg so I make sure it's correct?
> Of course. Here is the SVG code...
(it was this in the end: https://tinyurl.com/zpt83vs9)
https://simonwillison.net/2025/Jun/6/six-months-in-llms/
https://simonwillison.net/tags/pelican-riding-a-bicycle/
Full verbose documentation on the methodology: https://news.ycombinator.com/item?id=44217852
https://docs.claude.com/en/docs/build-with-claude/context-wi...
Branding is the true issue Anthropic has, though. Haiku 4.5 may be roughly equivalent in code output quality to Sonnet 4 (not saying it is, it's far too early to tell), which would serve a lot of users amazingly well. But given the connotations smaller models carry, alongside recent performance degradations that have made users more suspicious than before, getting them to adopt Haiku 4.5 over even Sonnet 4.5 will be challenging. I'd love to know whether Haiku 3, 3.5, and 4.5 are roughly in the same ballpark in terms of parameters, and of course, nerdy old me would like that to be public information for all models; but in fairness to the companies, many users would just go for the largest model, thinking it serves all use cases best. GPT-5 to me is still the most impressive because of its pricing relative to performance, and Haiku may end up similar, though with far less adoption. Everyone believes their task requires no less than Opus, it seems, after all.
For reference:
Haiku 3: I $0.25/M, O $1.25/M
Haiku 4.5: I $1.00/M, O $5.00/M
GPT-5: I $1.25/M, O $10.00/M
GPT-5-mini: I $0.25/M, O $2.00/M
GPT-5-nano: I $0.05/M, O $0.40/M
GLM-4.6: I $0.60/M, O $2.20/M
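To make those rates concrete, here's a rough per-task cost under a hypothetical workload of 100k input / 10k output tokens (the workload size is my assumption; the prices are from the list above):

    # Rough cost comparison for an assumed workload of
    # 100k input / 10k output tokens per task.
    PRICES = {  # (input $/M, output $/M)
        "Haiku 3":    (0.25, 1.25),
        "Haiku 4.5":  (1.00, 5.00),
        "GPT-5":      (1.25, 10.00),
        "GPT-5-mini": (0.25, 2.00),
        "GPT-5-nano": (0.05, 0.40),
        "GLM-4.6":    (0.60, 2.20),
    }

    IN_TOK, OUT_TOK = 100_000, 10_000
    for model, (p_in, p_out) in PRICES.items():
        cost = (IN_TOK * p_in + OUT_TOK * p_out) / 1_000_000
        print(f"{model:11s} ${cost:.4f} per task")

On that workload, Haiku 4.5 comes out to $0.15 per task versus $0.225 for GPT-5, while GPT-5-nano is under a cent.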
Where it doesn't shine as much is on very large coding tasks, but it is a phenomenal model for small coding tasks, and the speed improvement is very welcome.
Haiku 4.5: I $1.00/M, O $5.00/M
Grok Code: I $0.2/M, O $1.5/M
Usually I use GPT-5-mini for that task. Haiku 4.5 runs 3x faster with roughly comparable results (I slightly prefer the GPT-5-mini output, but I may just have become accustomed to it).
I agree that the models from OpenAI and Google respond much more slowly than the models from Anthropic. That makes a lot of them impractical for me.
Haiku 4.5 is very good but still seems to be adding a second of latency.
I expect I will be a lot more productive using this instead of Claude 4.5, which has been my daily-driver LLM since it came out.
What's the advantage of using Haiku for me?
Is it just faster?
doesn't work
minimaxir•2h ago
Given that Sonnet is still a popular model for coding despite the much higher cost, I expect Haiku will get traction if the quality is as good as this post claims.
Bolwin•2h ago
This could be massive.
logicchains•2h ago
HarHarVeryFunny•1h ago
I suppose it depends on how you are using it, but for coding, isn't output cost more relevant than input? Requirements in, code out.
criemen•1h ago
Depends on what you're doing, but for modifying an existing project (rather than greenfield), input tokens >> output tokens in my experience.
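To put rough numbers on that, priced at Haiku 4.5's $1/$5 per MTok from this thread (both workloads below are made up for illustration):

    # Which price dominates depends on the input:output ratio.
    PRICE_IN, PRICE_OUT = 1.00, 5.00  # Haiku 4.5, $/M tokens

    def cost(tokens_in: int, tokens_out: int) -> float:
        return (tokens_in * PRICE_IN + tokens_out * PRICE_OUT) / 1_000_000

    # Greenfield-ish: small prompt, lots of generated code.
    print(cost(5_000, 20_000))    # $0.105 -- ~95% of it is output cost
    # Editing an existing repo: big context in, small diff out.
    print(cost(200_000, 4_000))   # $0.220 -- ~91% of it is input cost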
logicchains•1h ago
Tiberium•1h ago
https://docs.claude.com/en/docs/build-with-claude/prompt-cac...
https://ai.google.dev/gemini-api/docs/caching
https://platform.openai.com/docs/guides/prompt-caching
https://docs.x.ai/docs/models#cached-prompt-tokens
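For the Anthropic API, caching is opt-in per content block. A minimal sketch, assuming the current Python SDK; the model ID and prompt text are placeholders:

    # Mark a large, stable prefix (here the system prompt) with
    # cache_control so repeat requests bill it at the cached rate.
    # Note caching only kicks in above a minimum prefix length.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env

    big_stable_context = "...long shared context reused across calls..."

    response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=512,
        system=[{
            "type": "text",
            "text": big_stable_context,
            "cache_control": {"type": "ephemeral"},
        }],
        messages=[{"role": "user", "content": "Summarize the context."}],
    )

    # Usage reports cache writes and cache reads separately.
    print(response.usage.cache_creation_input_tokens,
          response.usage.cache_read_input_tokens)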
tempusalaria•1h ago
simonw•1h ago
criemen•1h ago
If I'm missing something about how inference works that explains why there is still a cost for cached tokens, please let me know!
simonw•1h ago
criemen•1h ago
nthypes•55m ago
criemen•41m ago
minimaxir•38m ago
dotancohen•15m ago
simonw•2h ago
I was hoping Anthropic would introduce something price-competitive with the cheaper models from OpenAI and Gemini, which get as low as $0.05/$0.40 (GPT-5-Nano) and $0.075/$0.30 (Gemini 2.0 Flash Lite).
diwank•1h ago
odie5533•1h ago
dr_dshiv•1h ago
odie5533•12m ago
rudedogg•47m ago
justinbaker84•29m ago
I spend way too much time waiting for the cutting-edge models to return a response. 73% on SWE-bench is plenty good enough for me.