You LLM people all have serious cases of Dunning-Kruger
Close to what, and how are you measuring?
> nobody in the USA would be spending 7 figures on infrastructure for it
Au contraire, if AI had a moat it would pay for itself. They're funneling capital into infrastructure because they know it can't.
edit: Note that you can run it yourself or access it from other providers too: https://openrouter.ai/moonshotai/kimi-k2.6/providers
Edit: found it.
> We may use your Content to operate, maintain, improve, and develop the Services, to comply with legal obligations, to enforce our policies, and to ensure security. You may opt out of allowing your Content to be used for model improvement and research purposes by contacting us at membership@moonshot.ai. We will honor your choice in accordance with applicable law.
Section 3 of https://www.kimi.com/user/agreement/modelUse?version=v2
Is this the same model?
Unsloth quants: https://huggingface.co/unsloth/Kimi-K2.6-GGUF
(Work in progress: no GGUF files yet, and the header message says as much.)
Kimi 2.5 (which this is based on) is served at $0.44 input / $2 output per million tokens by a ton of different providers on OpenRouter; 2.6 will certainly be similar.
That's about 11X less than Opus for similar smarts.
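A rough sketch of how a ratio like that can be checked. The Kimi prices are the ones quoted above, read as dollars per million tokens (an assumption about the unit). The Opus prices are NOT given in this thread: the $5 / $25 figures below are placeholders I'm assuming for illustration, so swap in whatever your provider actually charges. The function names are made up for this example.

```python
# Prices are (input, output) in USD per million tokens.
KIMI_PRICES = (0.44, 2.00)          # quoted in the comment above
OPUS_PRICES_ASSUMED = (5.00, 25.00)  # ASSUMPTION: placeholder, not from this thread


def request_cost(input_tokens, output_tokens, prices):
    """Cost in USD of one workload at the given per-million-token prices."""
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cost_ratio(input_tokens, output_tokens, cheap, expensive):
    """How many times more the expensive model costs for the same workload."""
    return (request_cost(input_tokens, output_tokens, expensive)
            / request_cost(input_tokens, output_tokens, cheap))


if __name__ == "__main__":
    # Try a few input/output mixes; the ratio depends on the mix.
    for in_tok, out_tok in [(1_000_000, 0), (1_000_000, 100_000), (0, 1_000_000)]:
        ratio = cost_ratio(in_tok, out_tok, KIMI_PRICES, OPUS_PRICES_ASSUMED)
        print(f"{in_tok:>9} in / {out_tok:>9} out: {ratio:.1f}x")
```

The ratio shifts with the input/output mix, so a single "NX cheaper" figure only holds for a particular workload shape and a particular pair of price lists.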
Private companies will never open up a technological breakthrough to their competitors. It just doesn't make sense. If you want an entire field to advance, you have to open it up.
Also discovered that using OpenCode instead of the Kimi CLI really hurts model performance (2.5).
Transcript and HTML here: https://gist.github.com/simonw/ecaad98efe0f747e27bc0e0ebc669...
Sometimes a single prompt/response pass can unblock you on issues where Opus ate $100+ in API credits and went in circles for hours. Other times the response is useless, but it is your responsibility as an engineer to discern this.
Verdict (at least for me): use both.
Might be a configuration or prompt issue. I guess I'll wait and see, but I can't get any use out of this right now.
NitpickLawyer•28m ago
The ~100k hardware is suitable for multi-user, small-team usage. That's what you'd use for actual work in reasonable timeframes. For personal use, sure, Macs could work.