Kimi-K2 Tech Report [pdf]

https://github.com/MoonshotAI/Kimi-K2/blob/main/tech_report.pdf

84•swyx•6mo ago

Comments

dang•6mo ago

Related. Others?

China's moonshot launches free AI model Kimi K2 that outperforms GPT4 - https://news.ycombinator.com/item?id=44575309 - July 2025 (3 comments)

Kimi K2 and when "DeepSeek Moments" become normal - https://news.ycombinator.com/item?id=44561565 - July 2025 (2 comments)

Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model - https://news.ycombinator.com/item?id=44533403 - July 2025 (178 comments)

jtrn•6mo ago

The results without the fluff:

Model Architecture
* Type: Mixture-of-Experts (MoE) transformer model.
* Total Parameters: 1 trillion.
* Activated Parameters: 32 billion.
* Experts: 384 total experts, with 8 activated per token.
* Attention Heads: 64.
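To make the sparsity figures above concrete, here is a minimal arithmetic sketch in Python. The constants come straight from the list; the variable and function names are illustrative only. The parameter fraction comes out higher than the expert fraction, presumably because some parameters (attention, embeddings, and so on) are active for every token regardless of routing, though that breakdown is an assumption and not something stated in this summary.

```python
# Figures quoted in the summary above.
TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion
ACTIVE_PARAMS = 32_000_000_000     # 32 billion
TOTAL_EXPERTS = 384
ACTIVE_EXPERTS = 8


def fraction(part: int, whole: int) -> float:
    """Return part/whole as a plain ratio."""
    return part / whole


param_fraction = fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
expert_fraction = fraction(ACTIVE_EXPERTS, TOTAL_EXPERTS)

print(f"parameters active per token: {param_fraction:.1%}")  # 3.2%
print(f"experts active per token:    {expert_fraction:.1%}")  # 2.1%
```

So each token touches roughly 3% of the weights, which is why a 1T-parameter model can have per-token compute closer to that of a 32B dense model.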

Pre-training
* Optimizer: A novel optimizer named MuonClip was used. It integrates the Muon optimizer with a QK-Clip mechanism to address training instability.
* Dataset: The model was pre-trained on 15.5 trillion tokens.
* Training Process: Kimi K2 was trained with zero loss spikes. The initial context window was 4,096 tokens, later extended to 128k tokens using the YaRN method.
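For readers wondering what a QK-Clip style step might look like, here is a rough single-head sketch in plain Python. It reflects a general understanding of attention-logit clipping, not the report's exact procedure: the idea is to measure the largest attention logit and, if it exceeds a threshold, shrink the query and key projection weights so the logit falls back to that threshold. The threshold `tau`, the helper names, and the toy matrices are all assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def max_logit(x, w_q, w_k):
    """Largest scaled attention logit q_i . k_j / sqrt(d) over all token pairs."""
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    d = len(w_q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    return max(max(row) for row in scores) / math.sqrt(d)


def qk_clip(x, w_q, w_k, tau):
    """Rescale w_q and w_k so the max logit does not exceed tau (hypothetical threshold)."""
    s_max = max_logit(x, w_q, w_k)
    if s_max <= tau:
        return w_q, w_k  # nothing to clip
    # Splitting the factor evenly: each matrix gets sqrt(tau / s_max),
    # so the product q . k shrinks by exactly tau / s_max.
    scale = math.sqrt(tau / s_max)
    return ([[v * scale for v in row] for row in w_q],
            [[v * scale for v in row] for row in w_k])


# Toy input: two tokens, model width 2, deliberately large weights.
x = [[1.0, 2.0], [3.0, 4.0]]
w_q = [[10.0, 0.0], [0.0, 10.0]]
w_k = [[10.0, 0.0], [0.0, 10.0]]

before = max_logit(x, w_q, w_k)
clipped_q, clipped_k = qk_clip(x, w_q, w_k, tau=100.0)
after = max_logit(x, clipped_q, clipped_k)
print(f"max logit before: {before:.1f}, after: {after:.1f}")
```

In a real training loop this would run per attention head after the optimizer update; see the report itself for how it is actually combined with Muon.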

Post-training
* The model underwent a multi-stage process featuring a large-scale agentic data synthesis pipeline and a joint reinforcement learning (RL) stage.
* The RL framework combines verifiable rewards with a self-critique rubric reward mechanism.
* A data synthesis pipeline generated tens of thousands of tool-use training examples.

Performance Benchmarks (non-thinking mode)
* SWE-bench Verified: 65.8%.
* SWE-bench Multilingual: 47.3%.
* LiveCodeBench v6: 53.7%.
* OJBench: 27.1%.
* Tau2-Bench micro-average: 66.1.
* ACEBench (en): 76.5.
* AIME 2025: 49.5.
* GPQA-Diamond: 75.1.
* LMSYS Arena Leaderboard (July 17, 2025): Ranked 1st among open-source models and 5th overall.

chisleu•6mo ago

It looks like qwen3-coder is going to steal K2's thunder in terms of agentic coding use.

jadbox•6mo ago

Maybe so, but currently I like the sound of K2's writing more so than qwen3 (so far in my testing).

swyx•6mo ago

(hi i'm OP) kimi k2 was released a while ago, and some headlines like muonclip were already discussed*, but the tech report is new so i submitted it here. their own highlights are here: https://x.com/Kimi_Moonshot/status/1947520758760313170

we just covered it today on the latent.space paper club if you want to listen along while reading this paper https://youtu.be/VHwZa7lZhK8

definitely see also sebastian raschka's writeup https://t.co/oEt8XzNxik

*background on muon and muonclip https://www.youtube.com/watch?v=fcTNQLebHb0

OutOfHere•6mo ago

It has a small context length of just 128K.
