Kimi K2.6: Advancing Open-Source Coding

https://www.kimi.com/blog/kimi-k2-6
208•meetpateltech•1h ago

Comments

irthomasthomas•1h ago
Beats opus 4.6! They missed claiming the frontier by a few days.
NitpickLawyer•1h ago
While I'm skeptical of any "beats opus" claims (many were said, none turned out to be true), I still think it's insane that we can now run close-to-SotA models locally on ~100k worth of hardware, for a small team, and be 100% sure that the data stays local. Should be a no-brainer for teams that work in areas where privacy matters.
cedws•1h ago
Even the smaller quantized models which can run on consumer hardware pack in an almost unfathomable amount of knowledge. I don't think I expected to be able to run a 'local Google' in my lifetime before the LLM boom.
osti•53m ago
I think this one only needs about 600GB of VRAM, so it could fit on two Mac Studios with 512GB of memory each. Those would have cost (though they're no longer available) something like under 20k.
NitpickLawyer•48m ago
Yeah, but that's personal use at best; not much agentic anything is happening on that hardware. Macs are great for small models at small-to-medium context lengths, but at >64k (something very common with agentic usage) they struggle and slow down a lot.

The ~100k hardware is suitable for multi-user, small-team usage. That's what you'd use for actual work in reasonable timeframes. For personal use, sure, Macs could work.

zozbot234•26m ago
You could run it with SSD offload, earlier experiments with Kimi 2.5 on M5 hardware had it running at 2 tok/s. K2.6 has a similar amount of total and active parameters.
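(A rough back-of-the-envelope sketch of the memory math discussed above; the ~1.1T parameter count comes from the Hugging Face metadata mentioned further down, the effective bits per weight is an assumption, and runtime overhead is ignored.)

    # Rough memory math for "fit it on two 512GB Mac Studios".
    # Assumptions: ~1.1T total parameters, ~4.5 effective bits per weight
    # (mostly 4-bit weights plus some higher-precision tensors). Not measured figures.
    total_params = 1.1e12
    bits_per_weight = 4.5
    weights_gb = total_params * bits_per_weight / 8 / 1e9
    unified_memory_gb = 2 * 512
    print(f"weights: ~{weights_gb:.0f} GB")                            # ~620 GB
    print(f"headroom left: ~{unified_memory_gb - weights_gb:.0f} GB")  # ~400 GB for KV cache etc.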
BoorishBears•1h ago
Opus is clearly a sidegrade meant to help Anthropic manage cost, so I would say they may have it if it actually beats 4.6
irthomasthomas•1h ago
Could be right. I just noticed my feed is absent the usual flood of posts demoing the new hotness on 3D modeling, game design and SVG drawings of animals on vehicles.
pixel_popping•44m ago
It doesn't beat Opus 4.6, no way, don't be fooled by benchmarks.
nickandbro•1h ago
Wow, if the benchmarks check out with the vibes, this could almost be a DeepSeek moment, with Chinese AI now neck and neck with SOTA models from US labs.
motoboi•1h ago
With the previous generation? Yes. With 10T mythos-level models? Not even close.
bestouff•1h ago
There's no public data about Mythos.
maplethorpe•1h ago
That's because it would be too dangerous to release.
nisegami•1h ago
So is my P=NP proof.
cedws•1h ago
My girlfriend goes to a different school, you wouldn't know her.
squarefoot•55m ago
Same for teleport, time travel and warp drive.
amazingamazing•1h ago
The psyop continues. Until it's released, Mythos is vaporware. Notice how you can try Kimi 2.6. Where is the same for Mythos?
fragmede•3m ago
It's been released to "select partners".
ChrisLTD•1h ago
Mythos isn't the current generation, it's literally vaporware.
jollymonATX•1h ago
According to the benchmarks, you are wrong: it is on par with, and slightly above, some SOTA models. That's just the benchmarks speaking, though; they can be (and are) gamed by all the big model labs, domestic ones included.
irthomasthomas•1h ago
10T? Impossible! They told us the training run was under 10^26 flops.
lbreakjai•29m ago
I've got a 12T model on my machine, built it myself. It's called Mytho. Too dangerous to even release a fact sheet about it. It can hack into the mainframe, enhance ultra-compressed images, grow your hair back, and make people fall in love with you.
sergiotapia•10m ago
mythos is vaporware right now, what are you talking about?
mistercheph•10m ago
Mythos doesn't exist.
ai_fry_ur_brain•30m ago
It's not anywhere close, and if it was, nobody in the USA would be spending 7 figures on infrastructure for it.

You LLM people all have serious cases of Dunning-Kruger.

otabdeveloper4•23m ago
> It's not anywhere close

Close to what, and how are you measuring?

> nobody in the USA would be spending 7 figures on infrastructure for it

Au contraire, if AI had a moat it would pay for itself. They're funneling capital into infrastructure because they know it can't.

jstummbillig•2m ago
What?
swingboy•1h ago
Exciting benchmarks if true. What kind of hardware do they typically run these benchmarks on? Apologies if my terminology is off, but I assume they're using an unquantized version that wouldn't run on even the beefiest MacBook?
esafak•1h ago
K2.5 was already pretty decent so I would try this. Starting at $15/month: https://www.kimi.com/membership/pricing

edit: Note that you can run it yourself with sufficient resources, or access it from other providers too: https://openrouter.ai/moonshotai/kimi-k2.6/providers
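(For anyone who wants to poke at the OpenRouter route mentioned above, a minimal sketch using the OpenAI-compatible endpoint; the model slug is assumed from the providers link and may differ.)

    # Minimal sketch: Kimi K2.6 via OpenRouter's OpenAI-compatible API.
    # Requires `pip install openai` and an OpenRouter API key.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="YOUR_OPENROUTER_API_KEY",  # placeholder
    )

    response = client.chat.completions.create(
        model="moonshotai/kimi-k2.6",  # slug assumed from the providers page above
        messages=[{"role": "user", "content": "Refactor this function to be tail-recursive: ..."}],
    )
    print(response.choices[0].message.content)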

wg0•1h ago
How are the usage limits compared to Anthropic?
greenavocado•59m ago
Anthropic has the worst usage limits in the industry
andriy_koval•11m ago
gemini is worse imo
deaux•3m ago
You're correct, Gemini chat limits are a joke at their cheapest paid tier compared to both Claude and GPT. Especially crazy when you consider Gemini 3 Pro costs less than half as much as Opus 4.6 on the API. It's hard to run into pure chat limits on Claude even if you only use Opus on the cheapest tier, whereas with Gemini they're easy to hit.

Not sure about coding usage, Google being weird about these things I could see that quota being separate.

pbowyer•45m ago
What's the privacy/data security like? I can't find that on that page.

Edit: found it.

> We may use your Content to operate, maintain, improve, and develop the Services, to comply with legal obligations, to enforce our policies, and to ensure security. You may opt out of allowing your Content to be used for model improvement and research purposes by contacting us at membership@moonshot.ai. We will honor your choice in accordance with applicable law.

Section 3 of https://www.kimi.com/user/agreement/modelUse?version=v2

pixel_popping•36m ago
You really rely on ToS from Anthropic/OpenAI to know if they use your prompts or not? It's on their servers, why wouldn't they use our data?
gpm•17m ago
> We will honor your choice in accordance with applicable law.

So in other words only if you can point to a local law which requires them to comply with the opt out?

deaux•6m ago
Yup, they train on your inputs and OpenRouter is complicit by claiming that Moonshot's ToS says that they don't. Contacted OpenRouter about this a while ago and was met with silence because it's bad for their business to stop lying about it.
lbreakjai•1h ago
I have a subscription through work and I've been trialing it; so far it looks on par with, if not better than, Opus.
verdverm•1h ago
https://huggingface.co/moonshotai/Kimi-K2.6

Is this the same model?

Unsloth quants: https://huggingface.co/unsloth/Kimi-K2.6-GGUF

(work in progress, no gguf files yet, header message saying as much)

Balinares•53m ago
Quite curious how well real usage will back the benchmarks, because even if it's only Opus ballpark, open weights Opus ballpark is seismic.
gpm•12m ago
Huh, so the metadata says 1.1 trillion parameters, each 32 or 16 bits.

But the files are only roughly 640GB in size (~10GB * 64 files, slightly less in fact). Shouldn't they be closer to 2.2TB?

johndough•5m ago
The bulk of Kimi-K2.6's parameters are stored with 4 bits per weight, not 16 or 32. There are a few parameters that are stored with higher precision, but they make up only a fraction of the total parameters.
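(The arithmetic, as a sketch, lands in the right ballpark; the ~2% high-precision fraction below is a guess for illustration, not a figure from the model card.)

    # Why the files come to ~640GB instead of ~2.2TB for ~1.1T parameters.
    total_params = 1.1e12
    bf16_gb = total_params * 2 / 1e9              # every weight at 16 bits -> ~2200 GB
    frac_hi = 0.02                                # assumed share kept at 16 bits
    mixed_gb = (total_params * (1 - frac_hi) * 0.5 + total_params * frac_hi * 2) / 1e9
    print(f"all-BF16: ~{bf16_gb:.0f} GB, mostly-4-bit: ~{mixed_gb:.0f} GB")  # ~2200 GB vs ~583 GB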
gpm•3m ago
Huh, cool. I guess that makes a lot of sense with all the success the quantization people have been having.

So am I misunderstanding "Tensor type F32 · I32 · BF16" or is it just tagged wrong?

pt9567•1h ago
Wow - $0.95 input / $4 output. If it's anywhere near Opus 4.6, that's incredible.
corlinp•1h ago
This should erase any doubt that AI Labs are making $$$ on API inference.

Kimi 2.5 (which this is based on) is served at $0.44 input / $2 output by a ton of different providers on OpenRouter; 2.6 will certainly be similar.

That's about 11X less than Opus for similar smarts.
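(A quick cost sketch using the prices quoted above, assumed to be per million tokens; the Opus figure is an illustrative assumption, not from the article.)

    # Cost of a hypothetical workload: 10M input + 2M output tokens.
    def cost(in_m, out_m, in_price, out_price):
        return in_m * in_price + out_m * out_price

    kimi_k26 = cost(10, 2, 0.95, 4.00)   # prices quoted above
    kimi_k25 = cost(10, 2, 0.44, 2.00)   # prices quoted above
    opus     = cost(10, 2, 5.00, 25.00)  # assumed Opus pricing, for illustration only
    print(f"K2.6 ${kimi_k26:.2f}, K2.5 ${kimi_k25:.2f}, Opus (assumed) ${opus:.2f}")
    # -> K2.6 $17.50, K2.5 $8.40, Opus (assumed) $100.00 (~11x the K2.5 cost)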

Lalabadie•51m ago
Famously, OpenAI and Anthropic are devoted to increasing efficiency before scaling up resource usage.
amazingamazing•46m ago
How does it erase any doubt? You're implying Chinese models can't actually be cheaper to produce than American ones, which is laughable.
greenavocado•1h ago
I pray the benchmark figures are true so I can stop paying Anthropic, which screwed me over this quarter by dumbing down its models, making usage quotas ridiculously small, and demanding KYC paperwork.
jollymonATX•1h ago
Anthropic has done horrible PR and investors should be livid.
greenavocado•1h ago
My theory is they pushed retail off their systems to make room for their new corporate fat cat clients. In which case, they'll do just fine.
deaux•9m ago
> dumbing down their models,

This should be so easy to prove if it were true, yet there is no evidence of it, just vibes.

Still, your other two points are completely valid. The opaqueness of usage quotas is a scam: within a single month, for a single model, they can differ by more than 2x. And that has indeed been proven.

elfbargpt•1h ago
I've always been surprised Kimi doesn't get more attention than it does. It's always stood out to me in terms of creativity and quality... it has been my favorite model for a while (but I'm far from an authority).
regularfry•59m ago
Dirt cheap on openrouter for how good it is, too. Really hoping that 2.6 carries on that tradition.
varispeed•42m ago
Maybe because it's a bit like unleashing a chaos monkey on your codebase? I tried it locally (K2.5 72B) and couldn't get anything useful.
KaoruAoiShiho•41m ago
Huh, that's not a thing?
johndough•34m ago
The parent poster is probably referring to Kimi-Dev-72B¹, which is a much smaller and older model, while people are probably more familiar with the big and fairly powerful 1100B Kimi-K2.5².

[1] https://huggingface.co/moonshotai/Kimi-Dev-72B

[2] https://huggingface.co/moonshotai/Kimi-K2.5

natrys•20m ago
Yes, it was good for its time, but it's 10 months old now, which is a long time in this space. It was also a fine-tune (albeit a good one) of Qwen-2.5 72B.

I wish they made more small models. Kimi Linear doesn't really count; it was more of a proof-of-concept thing.

culi•41m ago
It's also one of the few models that seem capable of drawing an SVG clock

https://clocks.brianmoore.com/

sigmoid10•22m ago
Is it? In your link it definitely failed to draw the clock.
dryarzeg•8m ago
I'm not really sure how this works, but I stayed on the page for a while, and then it reloaded and all the clocks changed. I guess there's either a collection of different clocks generated by the models, or maybe they're somehow generated in real time, but the fact is that what you see is not necessarily what I see.
sigmoid10•5m ago
Seems like it regenerates them to reflect the current time. Funny to see how some models (like Kimi and Deepseek) sometimes get it right and other times fail miserably on the level of ancient models like GPT 3.5.
SwellJoe•18m ago
Interesting that the best performers are all Chinese-made models (DeepSeek and Qwen also perform consistently well). I wonder if there's more focus on vision and illustration in their training, or if something else is leading to their clear lead on this one test.
twotwotwo•24m ago
Kagi has it as an option in its Assistant thing, where there is naturally a lot of searching and summarizing results. I've liked its output there and in general when asked for prose that isn't in the list/Markdown-heavy "LLM style." It's hard to do a confident comparison, but it's seemed bold in arranging the output to flow well, even when that took surgery on the original doc(s). Sometimes the surgery's needed e.g. to connect related ideas the inputs treated as separate, or to ensure it really replies to the request instead of just dumping info that's somehow related to it.
Aeolun•3m ago
It’s good, but it’s not quite Claude level. And their API has constant capacity issues.
game_the0ry•1h ago
There is some humor in the fact that China (of all countries) is pioneering possibly the world's most important tech via open source, while we (the US) are doing the exact opposite.
osti•55m ago
Maybe open source == communism
darkwater•48m ago
Good ol' Steve "Developers! Developers! Developers!" Ballmer said so a long time ago. What a visionary!
konart•28m ago
But China is not communist even though the ruling party has the word in its name.
osti•20m ago
Oh i’m fully aware of that lol
fragmede•6m ago
The Democratic People's Republic of Korea would like a word.
tadfisher•17m ago
Nah, open source means those who do the work own the result. It's supercapitalism.
culi•33m ago
All great technological advancements have come through opening up technology. Just look at your iPhone. GPS, the internet, AI voice assistants, touchscreens, microprocessors, lithium-ion batteries, etc all came from gov't research (I'm counting Bell Labs' gov't mandated monopoly + research funding as gov't) that was opened up for free instead of being locked behind a patent.

Private companies will never open up a technological breakthrough to their competitors. It just doesn't make sense. If you want an entire field to advance, you have to open it up.

nashadelic•29m ago
Additional humor is the "open" in OpenAI.
brandensilva•2m ago
We are at the point where uncontrolled capitalism collides with humanity.

I do wonder where we go from here.

nisegami•1h ago
The choice of example task for Long-Horizon Coding is a bit spooky if you squint, since it's nearing the territory of LLMs improving themselves.
Banditoz•1h ago
If the benchmarks are private, how do we reproduce the results? I looked up Humanity's Last Exam (https://agi.safe.ai/), which this model uses, and I can't seem to access it.
johndough•17m ago
You can request access here: https://huggingface.co/datasets/cais/hle

The test data is purposely difficult to access to reduce the chance of leaking it into the training dataset.
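(If access is granted, loading it presumably follows the usual gated-dataset flow on the Hub; a minimal sketch, assuming the standard `datasets` API and a token from an approved account.)

    # Sketch: load the gated HLE dataset after access has been approved.
    from huggingface_hub import login
    from datasets import load_dataset

    login(token="hf_...")                        # token from an account with approved access
    hle = load_dataset("cais/hle", split="test") # split name assumed
    print(len(hle), hle[0].keys())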

mariopt•56m ago
Really excited to try this one. I've been using Kimi 2.5 for design and it's really good, but borderline useless on backend/advanced tasks.

Also discovered that using OpenCode instead of the Kimi CLI really hurts the model's performance (2.5).

oliver236•55m ago
Isn't this better than Qwen?
simonw•33m ago
Accessed via OpenRouter, this one decided to wrap the SVG pelican in HTML with controls for the animation speed: https://gisthost.github.io/?ecaad98efe0f747e27bc0e0ebc669e94...

Transcript and HTML here: https://gist.github.com/simonw/ecaad98efe0f747e27bc0e0ebc669...

SwellJoe•24m ago
We got an overachiever, here. Kimi sounds like a teacher's pet kind of name.
FlyingSnake•21m ago
At this point drawing these Pelicans must be in the training data sets.
ffsm8•9m ago
Clearly not.

I mean the prompt was succinct and clear, as always - and it still decided to hallucinate multiple features (animation + controls) beyond the prompt.

I'd also like to point out that to date no drawing has actually been good from a quality perspective (as in comparable to what a decent designer would throw together).

They're always only "good" from the perspective of it being a one-shot, low-effort prompt. Very little content for training purposes.

hn8726•3m ago
Genuine question, what's the goal of posting this on almost every single new model thread here on HN? I may be old and grumpy but to me it got old a while ago, and is closer to a low effort Reddit comment
dmix•31m ago
I'm pretty sure Kimi is what Cursor uses for their "composer 2" model. Works pretty well as a fallback when Claude runs out, but it's definitely a downgrade.
cassianoleal•31m ago
If only their API wasn't tied to a Google or phone login...
cmrdporcupine•30m ago
Running it through opencode to their API and... it definitely seems like it's "overthinking" -- watching the thought process, it's been going for pages and pages and pages diagnosing and "thinking" things through... without doing anything. Sitting at 50k+ output tokens used now just going in thought circles, complete analysis paralysis.

Might be a configuration or prompt issue. I guess I'll wait and see, but I can't get use out of this now.

m4rkuskk•28m ago
I have been testing it in my app all morning, and the results line up with 4.6 Sonnet. This is just a "vibe" feeling, though, with no rigorous testing. I'm glad we have some real competition to the "frontier" models.
XCSme•9m ago
A bit weird to be comparing it to Opus-4.5 when 4.7 was released...

EDIT: Wrong comment: they compared it with 4.6, my comment was for the Qwen-3.6 Max release blog post...

wizee•7m ago
They're comparing to Opus 4.6, not 4.5. It was Anthropic's best public model up until last week.
zozbot234•5m ago
Some people would say it's still Anthropic's best public model!
candl•7m ago
Are there any coding plans for this? (i.e. no token limit, just an API call limit). Recently my account failed to be billed for GLM on z.ai and my subscription expired because of it... the pricing for GLM has gone through the roof in recent months, though...
