
Kimi K2.6: Advancing Open-Source Coding

https://www.kimi.com/blog/kimi-k2-6
167•meetpateltech•1h ago

Comments

irthomasthomas•1h ago
Beats opus 4.6! They missed claiming the frontier by a few days.
NitpickLawyer•1h ago
While I'm skeptical of any "beats Opus" claims (many have been made, and none turned out to be true), I still think it's insane that we can now run close-to-SotA models locally on ~$100k worth of hardware, for a small team, and be 100% sure that the data stays local. Should be a no-brainer for teams that work in areas where privacy matters.
cedws•55m ago
Even the smaller quantized models which can run on consumer hardware pack in an almost unfathomable amount of knowledge. I don't think I expected to be able to run a 'local Google' in my lifetime before the LLM boom.
osti•33m ago
I think this one is only about 600GB of VRAM usage, so it could fit on two Mac Studios with 512GB of VRAM each. That would have cost (though they're no longer available) something like $20k or less.
NitpickLawyer•28m ago
Yeah, but that's personal use at best; not much agentic anything is happening on that hardware. Macs are great for small models at small-to-medium context lengths, but at >64k context (very common with agentic usage) they struggle and slow down a lot.

The ~$100k hardware is suitable for multi-user, small-team usage. That's what you'd use for actual work in reasonable timeframes. For personal use, sure, Macs could work.

zozbot234•6m ago
You could run it with SSD offload; earlier experiments with Kimi 2.5 on M5 hardware had it running at 2 tok/s. K2.6 has a similar number of total and active parameters.
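A quick back-of-the-envelope sketch of the VRAM figures in this subthread, assuming ~1100B total parameters (the figure mentioned elsewhere in the thread) and counting weight storage only, with no KV cache or runtime overhead:

```python
# Rough memory estimate for a ~1100B-parameter model at common
# quantization levels. Weight storage only; KV cache, activations,
# and framework overhead are ignored.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB for a given quantization."""
    return n_params * bits_per_weight / 8 / 1e9

N = 1.1e12  # ~1100B total parameters (assumption from the thread)

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{weight_memory_gb(N, bits):.0f} GB")
```

At roughly 4 bits per weight this lands near the ~600 GB figure above; fp16 would need about 2.2 TB.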
BoorishBears•1h ago
Opus is clearly a sidegrade meant to help Anthropic manage cost, so I'd say they may have claimed the frontier if this actually beats 4.6.
irthomasthomas•55m ago
Could be right. I just noticed my feed is absent the usual flood of posts demoing the new hotness on 3D modeling, game design and SVG drawings of animals on vehicles.
pixel_popping•24m ago
It doesn't beat Opus 4.6, no way, don't be fooled by benchmarks.
nickandbro•1h ago
Wow, if the benchmarks check out with the vibes, this could almost be like a DeepSeek moment, with Chinese AI now being neck and neck with SOTA models from US labs.
motoboi•59m ago
With the previous generation? Yes. With 10T Mythos-level models? Not even close.
bestouff•56m ago
There's no public data about Mythos.
maplethorpe•52m ago
That's because it would be too dangerous to release.
nisegami•43m ago
So is my P=NP proof.
cedws•43m ago
My girlfriend goes to a different school, you wouldn't know her.
squarefoot•35m ago
Same for teleport, time travel and warp drive.
amazingamazing•52m ago
The psyop continues. Mythos, until it's released, is vaporware. Notice how you can try Kimi 2.6. Where is the same for Mythos?
ChrisLTD•51m ago
Mythos isn't the current generation, it's literally vaporware.
jollymonATX•47m ago
According to the benchmarks, you are wrong. It is on track and slightly above some SOTA models. That's just the benchmarks speaking, though; they can be (and are) gamed by all big model labs, including domestic ones.
irthomasthomas•44m ago
10T? Impossible! They told us the training run was under 10^26 flops.
lbreakjai•9m ago
I've got a 12T model on my machine, built it myself. It's called Mytho. Too dangerous to even release a fact sheet about it. It can hack into the mainframe, enhance ultra-compressed images, grow your hair back, and make people fall in love with you.
ai_fry_ur_brain•10m ago
It's not anywhere close, and if it were, nobody in the USA would be spending 7 figures on infrastructure for it.

You LLM people here all have serious cases of Dunning-Kruger.

otabdeveloper4•3m ago
> Its not anywhere close

Close to what, and how are you measuring?

> nobody in the USA would be spending 7 figures on infrastructure for it

Au contraire, if AI had a moat it would pay for itself. They're funneling capital into infrastructure because they know it can't.

swingboy•1h ago
Exciting benchmarks if true. What kind of hardware do they typically run these benchmarks on? Apologies if my terminology is off, but I assume they're using an unquantized version that wouldn't run on even the beefiest MacBook?
esafak•59m ago
K2.5 was already pretty decent so I would try this. Starting at $15/month: https://www.kimi.com/membership/pricing

edit: Note that you can run it yourself or access it from other providers too: https://openrouter.ai/moonshotai/kimi-k2.6/providers
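For anyone who wants to poke at it programmatically, a minimal sketch of calling it through OpenRouter's OpenAI-compatible chat completions endpoint. The model slug is taken from the OpenRouter link above; the prompt and environment-variable handling are illustrative:

```python
# Sketch: single-turn chat completion against OpenRouter.
# Assumes an API key in the OPENROUTER_API_KEY environment variable.
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "moonshotai/kimi-k2.6"  # slug from the OpenRouter link above

def build_request(prompt: str, model: str = MODEL) -> dict:
    """Assemble the JSON body for a single-turn chat completion."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

def send(prompt: str) -> str:
    """POST the request and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        })
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

The same body works against any of the other providers on that page, since they all speak the OpenAI-compatible schema.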

wg0•47m ago
How are the usage limits compared to Anthropic?
greenavocado•39m ago
Anthropic has the worst usage limits in the industry
pbowyer•25m ago
What's the privacy/data security like? I can't find that on that page.

Edit: found it.

> We may use your Content to operate, maintain, improve, and develop the Services, to comply with legal obligations, to enforce our policies, and to ensure security. You may opt out of allowing your Content to be used for model improvement and research purposes by contacting us at membership@moonshot.ai. We will honor your choice in accordance with applicable law.

Section 3 of https://www.kimi.com/user/agreement/modelUse?version=v2

pixel_popping•16m ago
You really rely on ToS from Anthropic/OpenAI to know if they use your prompts or not? It's on their servers, why wouldn't they use our data?
lbreakjai•51m ago
I have a subscription through work and I've been trialing it; so far it looks on par with, if not better than, Opus.
verdverm•49m ago
https://huggingface.co/moonshotai/Kimi-K2.6

Is this the same model?

Unsloth quants: https://huggingface.co/unsloth/Kimi-K2.6-GGUF

(work in progress, no gguf files yet, header message saying as much)

Balinares•33m ago
Quite curious how well real usage will back the benchmarks, because even if it's only Opus ballpark, open weights Opus ballpark is seismic.
pt9567•49m ago
wow - $0.95 input / $4 output. If it's anywhere near Opus 4.6, that's incredible.
corlinp•40m ago
This should erase any doubt that AI Labs are making $$$ on API inference.

Kimi 2.5 (which this is based on) is served at $0.44 input / $2 output by a ton of different providers on OpenRouter, 2.6 will certainly be similar.

That's about 11X less than Opus for similar smarts.
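For concreteness, a small sketch of that arithmetic. The Kimi prices are the $0.44 / $2 per-million-token figures quoted above; the Opus prices are an assumption picked to illustrate the claimed ~11x gap, not published pricing:

```python
# Cost comparison sketch. Prices are per 1M tokens. The Opus figures
# below are assumed for illustration, not taken from a price sheet.

def blended_cost(price_in: float, price_out: float,
                 tokens_in: int, tokens_out: int) -> float:
    """Dollar cost of a workload given per-1M-token prices."""
    return price_in * tokens_in / 1e6 + price_out * tokens_out / 1e6

# Hypothetical workload: 800k input tokens, 200k output tokens.
kimi = blended_cost(0.44, 2.0, 800_000, 200_000)
opus = blended_cost(5.0, 25.0, 800_000, 200_000)  # assumed Opus prices
print(f"Kimi: ${kimi:.2f}  Opus: ${opus:.2f}  ratio: {opus / kimi:.1f}x")
```

Under those assumptions the gap comes out around an order of magnitude, in line with the comment above.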

Lalabadie•30m ago
Famously, OpenAI and Anthropic are devoted to increasing efficiency before scaling up resource usage.
amazingamazing•26m ago
How does it erase any doubt? You're implying Chinese models can't actually be cheaper to produce than American ones, which is laughable.
greenavocado•48m ago
I pray the benchmark figures are true so I can stop paying Anthropic, which screwed me over this quarter by dumbing down their models, making usage quotas ridiculously small, and demanding KYC paperwork.
jollymonATX•46m ago
Anthropic has done horrible PR and investors should be livid.
greenavocado•45m ago
My theory is they pushed retail off their systems to make room for their new corporate fat cat clients. In which case, they'll do just fine.
elfbargpt•43m ago
I've always been surprised Kimi doesn't get more attention than it does. It's always stood out to me in terms of creativity and quality... it has been my favorite model for a while.
regularfry•39m ago
Dirt cheap on openrouter for how good it is, too. Really hoping that 2.6 carries on that tradition.
varispeed•22m ago
Maybe because it's a bit like unleashing a chaos monkey on your codebase? I tried it locally (K2.5 72B) and couldn't get anything useful.
KaoruAoiShiho•21m ago
Huh, that's not a thing?
johndough•14m ago
The parent poster is probably referring to Kimi-Dev-72B¹, which is a much smaller and older model, while people are probably more familiar with the big and fairly powerful 1100B Kimi-K2.5².

[1] https://huggingface.co/moonshotai/Kimi-Dev-72B

[2] https://huggingface.co/moonshotai/Kimi-K2.5

culi•21m ago
It's also one of the few models that seem capable of drawing an SVG clock

https://clocks.brianmoore.com/

sigmoid10•2m ago
Is it? In your link it definitely failed to draw the clock.
twotwotwo•4m ago
Kagi has it as an option in its Assistant thing, where there is naturally a lot of searching and summarizing results. I've liked its output there and in general when asked for prose that isn't in the list/Markdown-heavy "LLM style." It's hard to do a confident comparison, but it's seemed bold in arranging the output to flow well, even when that took surgery on the original doc(s). Sometimes the surgery's needed e.g. to connect related ideas the inputs treated as separate, or to ensure it really replies to the request instead of just dumping info that's somehow related to it.
game_the0ry•43m ago
There is some humor in the fact that China (of all countries) is pioneering possibly the world's most important tech via open source, while we (the US) are doing the exact opposite.
osti•35m ago
Maybe open source == communism
darkwater•28m ago
Good ol' Steve "Developers! Developers! Developers!" Ballmer said so a long time ago. What a visionary!
konart•8m ago
But China is not communist even though the ruling party has the word in its name.
culi•13m ago
All great technological advancements have come through opening up technology. Just look at your iPhone. GPS, the internet, AI voice assistants, touchscreens, microprocessors, lithium-ion batteries, etc all came from gov't research (I'm counting Bell Labs' gov't mandated monopoly + research funding as gov't) that was opened up for free instead of being locked behind a patent.

Private companies will never open up a technological breakthrough to their competitors. It just doesn't make sense. If you want an entire field to advance, you have to open it up.

nashadelic•9m ago
Additional humor: the "Open" in OpenAI.
nisegami•42m ago
The choice of example task for Long-Horizon Coding is a bit spooky if you squint, since it's nearing the territory of LLMs improving themselves.
Banditoz•41m ago
If the benchmarks are private, how do we reproduce the results? I looked up Humanity's Last Exam (https://agi.safe.ai/), which this model uses, and I can't seem to access it.
mariopt•36m ago
Really excited to try this one. I've been using Kimi 2.5 for design and it's really good, but borderline useless on backend/advanced tasks.

Also discovered that using OpenCode instead of the Kimi CLI really hurts the model's performance (2.5).

oliver236•35m ago
Isn't this better than Qwen?
simonw•13m ago
Accessed via OpenRouter, this one decided to wrap the SVG pelican in HTML with controls for the animation speed: https://gisthost.github.io/?ecaad98efe0f747e27bc0e0ebc669e94...

Transcript and HTML here: https://gist.github.com/simonw/ecaad98efe0f747e27bc0e0ebc669...

SwellJoe•4m ago
We got an overachiever here. Kimi sounds like a teacher's-pet kind of name.
dmix•11m ago
I'm pretty sure Kimi is what Cursor uses for their "composer 2" model. Works pretty well as a fallback when Claude runs out, but it's definitely a downgrade.
fintechie•11m ago
Gonna give this one a go... the previous 2.5 model is used for Cursor's Composer 2 Fast. After a few weeks of real-world tasks, I've seen that it can be very dumb or very good (better than Opus 4.7) depending on the problem you throw at it.

Sometimes a single prompt/response can unblock you on issues where Opus ate $100+ in API credits and circled for hours. Other times the response is useless, but it's your responsibility as an engineer to discern this.

Verdict (at least for me): use both.

cassianoleal•11m ago
If only their API wasn't tied to a Google or phone login...
cmrdporcupine•10m ago
Running it through opencode against their API and... it definitely seems to be "overthinking". Watching the thought process, it's been going for pages and pages, diagnosing and "thinking" things through... without doing anything. It's sitting at 50k+ output tokens now, just going in thought circles: complete analysis paralysis.

Might be a configuration or prompt issue. I guess I'll wait and see, but I can't get use out of this now.

m4rkuskk•8m ago
I have been testing it in my app all morning, and the results line up with Sonnet 4.6. This is just a "vibe" feeling with no real testing. I'm glad we have some real competition to the "frontier" models.

Hack Monty, Win $5k: Inside PydanticAI's Challenge

https://pydantic.dev/articles/hack-monty
1•v-mdev•46s ago•0 comments

Laz's Wolfenstein 3D Page

http://lazrojas.com/wolf3d/
1•justsomehnguy•1m ago•0 comments

Colorado River disappeared from the record for 5M years: now we know where it was

https://phys.org/news/2026-04-colorado-river-geological-million-years.html
1•wglb•1m ago•1 comments

Code Is the New Assembly

https://abhyrama.com/code-is-the-new-assembly/
1•flyaway123•1m ago•0 comments

The Download: murderous 'mirror' bacteria, and Chinese workers fighting AI doub

https://www.technologyreview.com/2026/04/20/1136154/the-download-murderous-mirror-bacteria-chines...
1•joozio•2m ago•0 comments

OpenData Timeseries: Prometheus-compatible metrics on object storage

https://www.opendata.dev/blog/introducing-timeseries/
1•hachikuji•2m ago•0 comments

The AI engineering stack we built internally – on the platform we ship

https://blog.cloudflare.com/internal-ai-engineering-stack/
1•mavelikara•2m ago•0 comments

Show HN: My Hyperliquid Terminal

https://www.aulico.com
1•kiosktryer•2m ago•0 comments

H.R.8250 – Parents Decide Act

https://www.congress.gov/bill/119th-congress/house-bill/8250/text
1•philips•3m ago•1 comments

CuTe Matrix Transpose

https://leimao.github.io/article/CuTe-Matrix-Transpose/
1•eigenBasis•3m ago•0 comments

Show HN: Noise.widgita.xyz – a zero-backend noise map for anywhere in the world

https://noise.widgita.xyz/
1•fairlight1337•6m ago•0 comments

Show HN: Hora – A Native SwiftUI Google Calendar Client for macOS

https://horacal.app/
1•szamski•9m ago•0 comments

Contact Lens Uses Microfluidics to Monitor and Treat Glaucoma

https://spectrum.ieee.org/smart-contact-lens-glaucoma-microfluidics
2•zdw•9m ago•0 comments

The Theory of Interstellar Trade [pdf]

https://www.princeton.edu/~pkrugman/interstellar.pdf
1•AFF87•9m ago•0 comments

Show HN: Reproducible benchmark – OpenAI charges 1.5x-3.3x more for non-English

https://github.com/vfalbor/llm-language-token-tax
1•vfalbor•11m ago•0 comments

I Like the Web They Want

https://vasilis.nl/nerd/2026/i-like-the-web-they-want/
3•speckx•12m ago•0 comments

Labor Automation Forecasting Hub on Metaculus Measures Impact of AI on Labor

https://www.metaculus.com/labor-hub/
1•postreal•14m ago•1 comments

DeWitt Clauses

https://danluu.com/anon-benchmark/
3•thomasahle•15m ago•1 comments

Show HN: Enlist AI: A tool that turns any job description into a study plan

https://enlistai.vercel.app
1•lilprince1218•17m ago•0 comments

Efficiently Transfer Files to LibreOffice Calc: A Step-by-Step Guide

https://shunspirit.com/article/how-to-transfer-files-to-libre-office-calc
1•rolph•21m ago•0 comments

Transitioning from Corporate to Open Source at 23 y.o

https://www.tharropoulos.dev/blog/transitioning-from-corporate-to-open-source/
2•tharropoulos•22m ago•0 comments

What we once had (at the height of the XMPP era of the Internet) (2023)

https://www.kirsle.net/what-we-once-had-at-the-height-of-the-xmpp-era-of-the-internet
3•birdculture•23m ago•0 comments

Agent-consistency – a Python consistency layer for multi-agent workflows

https://github.com/karimbaidar/agent-consistency-refund-demo
1•baidarkarim•23m ago•0 comments

Modern Board Games: and why you should play them (2022)

https://boardgamegeek.com/blog/10755/blogpost/124992/modern-board-games-and-why-you-should-play-them
1•maayank•25m ago•0 comments

Scaling Claude beyond individual workflows – lessons from our team

https://ninkovic.dev/blog/2026/scaling-claude-beyond-individual-workflows
1•nemwiz•25m ago•0 comments

Language Modeling Without Neural Networks

https://nathan.rs/posts/unbounded-n-gram/
2•nathan-barry•25m ago•0 comments

Power tools got worse on purpose

https://www.worseonpurpose.com/p/your-power-tools-got-worse-on-purpose
1•longhaul•27m ago•0 comments

A New Chapter for Ruby Central

https://rubycentral.org/news/a-new-chapter-for-ruby-central/
1•campuscodi•29m ago•0 comments

Quantum Computers Are Not a Threat to 128-Bit Symmetric Keys

https://words.filippo.io/128-bits/
3•hasheddan•29m ago•0 comments

Show HN: Open-source alternative HN front page with point highlights and search

https://github.com/pretzelai/hackernewsx
1•ramonga•31m ago•1 comments