Kimi K2.5 Technical Report [pdf]

https://github.com/MoonshotAI/Kimi-K2.5/blob/master/tech_report.pdf
81•vinhnx•3h ago

Comments

zeroxfe•1h ago
I've been using this model (as a coding agent) for the past few days, and it's the first time I've felt that an open source model really competes with the big labs. So far it's been able to handle most things I've thrown at it. I'm almost hesitant to say that this is as good as Opus.
thesurlydev•1h ago
Can you share how you're running it?
gigatexal•1h ago
Yeah, I'm curious too. Claude Code is so good, and the ecosystem so reliably "just works", that I'm willing to pay them.
epolanski•27m ago
You can plug another model in place of Anthropic ones in Claude Code.
zeroxfe•23m ago
That tends to work quite poorly because Claude Code does not use standard completions APIs. I tried it with Kimi, using litellm[proxy], and it failed in too many places.
Imustaskforhelp•10m ago
I tried kimi k2.5 and at first I didn't really like it. I was critical of it, but then I started liking it. Also, the model has kind of replaced how I use chatgpt too & I really love kimi 2.5 the most right now (although gemini models come close too)

To be honest, I do feel like kimi k2.5 is the best open source model. It's not the best model overall right now tho, but it's really price performant and might be a nice fit for many use cases.

It might not be the complete SOTA that people say, but it comes pretty close and it's open source, and I trust the open source part because other providers can also run it (also considering that iirc chatgpt recently slashed some old models)

I really appreciate kimi for still open sourcing their complete SOTA and then releasing some research papers on top of it, unlike Qwen, which has closed-sourced its complete SOTA.

Thank you Kimi!

explorigin•58m ago
https://unsloth.ai/docs/models/kimi-k2.5

Requirements are listed.

KolmogorovComp•22m ago
To save everyone a click

> The 1.8-bit (UD-TQ1_0) quant will run on a single 24GB GPU if you offload all MoE layers to system RAM (or a fast SSD). With ~256GB RAM, expect ~10 tokens/s. The full Kimi K2.5 model is 630GB and typically requires at least 4× H200 GPUs. If the model fits, you will get >40 tokens/s when using a B200. To run the model in near full precision, you can use the 4-bit or 5-bit quants. You can use any higher just to be safe. For strong performance, aim for >240GB of unified memory (or combined RAM+VRAM) to reach 10+ tokens/s. If you’re below that, it'll work but speed will drop (llama.cpp can still run via mmap/disk offload) and may fall from ~10 tokens/s to <2 token/s. We recommend UD-Q2_K_XL (375GB) as a good size/quality balance. Best rule of thumb: RAM+VRAM ≈ the quant size; otherwise it’ll still work, just slower due to offloading.
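
A rough sketch of the rule of thumb in the quote above (RAM+VRAM ≈ quant size), using only the two sizes the quote states. The function and dictionary names are hypothetical and are not part of llama.cpp or Unsloth; this only does the arithmetic and says nothing about actual tokens/s.

```python
# Sizes in GB, taken from the quoted Unsloth note (no other quant sizes are assumed).
QUANT_SIZES_GB = {
    "UD-Q2_K_XL": 375,   # recommended size/quality balance per the quote
    "full": 630,         # full Kimi K2.5 model per the quote
}


def fits_in_memory(quant_gb: float, ram_gb: float, vram_gb: float) -> bool:
    """True when combined RAM+VRAM covers the quant, i.e. no disk offload is needed."""
    return ram_gb + vram_gb >= quant_gb


def disk_offload_gb(quant_gb: float, ram_gb: float, vram_gb: float) -> float:
    """GB that would spill to mmap/disk offload (0 when the quant fits)."""
    return max(0.0, quant_gb - (ram_gb + vram_gb))


if __name__ == "__main__":
    # Hypothetical machine: 256 GB system RAM plus one 24 GB GPU.
    ram_gb, vram_gb = 256, 24
    for name, size_gb in QUANT_SIZES_GB.items():
        fits = fits_in_memory(size_gb, ram_gb, vram_gb)
        spill = disk_offload_gb(size_gb, ram_gb, vram_gb)
        print(f"{name}: fits={fits}, spills {spill:.0f} GB to disk")
```

If the quant does not fit, the quote says it still runs, just slower because of offloading.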

Gracana•11m ago
I'm running the Q4_K_M quant on a xeon with 7x A4000s and I'm getting about 8 tok/s with small context (16k). I need to do more tuning, I think I can get more out of it, but it's never gonna be fast on this suboptimal machine.
eknkc•27m ago
I've been using it with opencode. You can either use your kimi code subscription (flat fee), moonshot.ai api key (per token) or openrouter to access it. OpenCode works beautifully with the model.

Edit: as a side note, I only installed opencode to try this model and I gotta say it is pretty good. Did not think it'd be as good as claude code but its just fine. Been using it with codex too.

Imustaskforhelp•12m ago
I tried to use opencode for kimi k2.5 too but recently they changed their pricing from 200 tool requests/5 hour to token based pricing.

I can only speak from the tool-request-based plan, but anecdotally, for some reason opencode took like 10 requests in like 3-4 minutes where Kimi cli took 2-3.

So I personally like/stick with the kimi cli for kimi coding. I haven't tested it out again with OpenAI with the new token based pricing, but I do think that opencode might use more tokens.

Kimi Cli's pretty good too imo. You should check it out!

https://github.com/MoonshotAI/kimi-cli

zeroxfe•27m ago
Running it via https://platform.moonshot.ai -- using OpenCode. They have super cheap monthly plans at kimi.com too, but I'm not using it because I already have codex and claude monthly plans.
UncleOxidant•6m ago
so there's a free plan at moonshot.ai that gives you some number of tokens without paying?
armcat•1h ago
Out of curiosity, what kind of specs do you have (GPU / RAM)? I saw the requirements and it's beyond my budget so I am "stuck" with smaller Qwen coders.
Carrok•1h ago
Not OP but OpenCode and DeepInfra seem like an easy way.
zeroxfe•26m ago
I'm not running it locally (it's gigantic!) I'm using the API at https://platform.moonshot.ai
BeetleB•24m ago
Just curious - how does it compare to GLM 4.7? Ever since they gave the $28/year deal, I've been using it for personal projects and am very happy with it (via opencode).

https://z.ai/subscribe

zeroxfe•17m ago
It's waaay better than GLM 4.7 (which was the open model I was using earlier)! Kimi was able to quickly and smoothly finish some very complex tasks that GLM completely choked at.
InsideOutSanta•5m ago
There's no comparison. GLM 4.7 is fine and reasonably competent at writing code, but K2.5 is right up there with something like Sonnet 4.5. It's the first time I can use an open-source model and not immediately tell the difference between it and top-end models from Anthropic and OpenAI.
tgrowazay•24m ago
Just pick up any >240GB VRAM GPU off your local BestBuy to run a quantized version.

> The full Kimi K2.5 model is 630GB and typically requires at least 4× H200 GPUs.

margorczynski•1h ago
I wonder how K2.5 + OpenCode compares to Opus with CC. If it is close I would let go of my subscription, as probably a lot of people.
ithkuil•23m ago
I also wonder if CC can be used with k2.5 with the appropriate API adapter
naragon•21m ago
I've been using K2.5 with OpenCode to do code assessments/fixes and Opus 4.5 with CC to check the work, and so far so good. Very impressed with it so far, but I don't feel comfortable canceling my Claude subscription just yet. Haven't tried it on large feature implementations.
eknkc•15m ago
It is not opus. It is good, works really fast and is surprisingly thorough about its decisions. However I've seen it hallucinate things.

Just today I asked for a code review and it flagged a method that can be `static`. The problem is it was already static. That kind of stuff never happens with Opus 4.5 as far as I can tell.

Also, in opencode's Plan mode (read only), it generated a plan and, instead of presenting it and stopping, decided to implement it. It could not use the edit and write tools because the harness was in read only mode. But it had bash and started using bash to edit stuff. Wouldn't just fucking stop even though the error messages it received from opencode stated why. Its plan and the resulting code were ok so I let it go crazy though...

viraptor•1m ago
I wonder if it's not RL'd for CC enough. Usually it respects the plan mode on OC. When it forgets occasionally, it stops after the first failed attempt for me and asks to switch as expected.
derac•52m ago
I really like the agent swarm thing, is it possible to use that functionality with OpenCode or is that a Kimi CLI specific thing? Does the agent need to be aware of the capability?
zeroxfe•20m ago
It seems to work with OpenCode, but I can't tell exactly what's going on -- I was super impressed when OpenCode presented me with a UI to switch the view between different sub-agents. I don't know if OpenCode is aware of the capability, or the model is really good at telling the harness how to spawn sub-agents or execute parallel tool calls.
behnamoh•36m ago
It's a decent model but works best with kimi CLI, not CC or others.
alansaber•25m ago
Why do you think that is?
chillacy•17m ago
I heard it's because the labs fine tune their models for their own harness. Same reason why claude does better in claude code than cursor.
epolanski•26m ago
It's interesting to note that OpenAI is valued almost 400 times more than moonshotai, despite their models being surprisingly close.
llmslave•18m ago
The benchmarks on all these models are meaningless
miroljub•11m ago
I've been quite satisfied lately with MiniMax M-2.1 in opencode.

How does Kimi 2.5 compare to it in real world scenarios?

viraptor•7m ago
A lot better in my experience. M2.1 to me feels between haiku and sonnet. K2.5 feels close to opus. That's based on my testing of removing some code and getting it to reimplement based on tests. Also the design/spec writing feels great. You can still test k2.5 for free in OpenCode today.
