We just doubled our speculative code edit throughput and hit 10k tok/sec per request!
Morph now merges code at 10,500 tokens/sec — roughly 4× faster than the best speeds on Cerebras.
That kind of speed makes previously impractical workloads trivial: applying complex edits across a 40k-token document now takes under 4 seconds. This isn't a vanity metric: we think it unlocks an entirely new domain of AI use cases where codebases, configs, or long documents can be semantically edited in real time.
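The "under 4 seconds" figure follows directly from the claimed throughput; a quick sanity check (document size and rate are the numbers from the post):

```python
# Latency implied by the post's throughput claim.
tokens = 40_000        # size of the document being edited
throughput = 10_500    # claimed merge speed, tokens/sec
latency = tokens / throughput
print(f"{latency:.2f} s")  # ≈ 3.81 s, i.e. "under 4 seconds"
```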
Morph is a Fast Apply model dedicated to merging edits from frontier LLMs. We want to enable developers to build realtime interfaces with AI.
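To make the Fast Apply I/O shape concrete, here is a minimal sketch. A frontier LLM emits a "lazy" edit that marks unchanged regions with a sentinel comment, and the apply step merges it into the original file. Morph does this semantically with a model; the naive string-matching function below is purely illustrative and only handles the simple "prefix unchanged, suffix rewritten" shape:

```python
ORIGINAL = """\
def greet(name):
    print("hello", name)

def farewell(name):
    print("bye", name)
"""

# Lazy edit from a frontier LLM: only the changed function, with a
# sentinel standing in for everything it left alone.
EDIT = """\
# ... existing code ...
def farewell(name):
    print("goodbye", name)
"""

def naive_merge(original: str, edit: str,
                sentinel: str = "# ... existing code ...") -> str:
    """Illustrative line-based merge: a sentinel copies original lines
    forward until the next literal line anchors; literal lines are
    emitted verbatim. Not a real implementation of Fast Apply."""
    orig, out = original.splitlines(), []
    lines = edit.splitlines()
    i = j = 0  # cursors into original and edit
    while j < len(lines):
        if lines[j].strip() == sentinel:
            # Unchanged region: copy original lines up to the next anchor.
            nxt = lines[j + 1] if j + 1 < len(lines) else None
            while i < len(orig) and orig[i] != nxt:
                out.append(orig[i])
                i += 1
        else:
            out.append(lines[j])
        j += 1
    return "\n".join(out) + "\n"

print(naive_merge(ORIGINAL, EDIT))
```

The point of the model (versus string matching like this) is that it resolves ambiguous anchors, reindentation, and overlapping regions semantically, at 10k+ tokens/sec.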
NitpickLawyer•1h ago
Help me understand. Is this for cases where you have a file and you "ask" an LLM to change something, and they reply in chat mode with something like < //--unchanged code \n changed line \n changed line \n //----remaining code unchanged > ?
If so, isn't this flow like 6mo old, and not really used anymore? The latest tools (terminal based and vscode extensions like cline/roo/kilo) already support "diff edits", where the model outputs a diff format that the tool speaks. I get "instant" edits that way, right in my IDE, and model support has been great (gpt5,claude4,gemini2.5,grok-fast-1, etc.)
So what's the use case of this model, then? Cool technical results, and congrats, but it seems the "field" has already solved for this particular problem?
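For contrast, the diff-edit flow the comment describes is typically an exact search/replace block: the model outputs the old text and the new text, and the tool does a literal substitution (the exact block format varies by tool; this sketch is an assumption, not any specific tool's implementation):

```python
def apply_search_replace(text: str, search: str, replace: str) -> str:
    """Apply one search/replace edit block. Diff-edit tools generally
    require the search text to match exactly once; otherwise the edit
    is rejected and the model must retry."""
    if text.count(search) != 1:
        raise ValueError("search block must match exactly once")
    return text.replace(search, replace, 1)

src = 'def farewell(name):\n    print("bye", name)\n'
out = apply_search_replace(src, 'print("bye", name)', 'print("goodbye", name)')
print(out)
```

The trade-off the thread is debating: exact-match diffs are free and instant but fail on imprecise matches, while a Fast Apply model tolerates lazy, imperfect edit snippets at the cost of a (fast) inference call.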
anon191928•1h ago
SEC? They will be against this. They have been against financial innovation, and if they see this they will be against it too. SEC is special.