The core motivation is that modern AI stacks increasingly depend on hand-optimized kernels (GEMM, attention, reductions, fused ops), but writing and tuning them for each hardware target (NVIDIA GPUs, AMD GPUs, custom accelerators like MTIA) does not scale.
KernelEvolve treats kernel programming as a search + evolution problem:
• An LLM generates candidate kernels (e.g., Triton-like code)
• Kernels are compiled, benchmarked, and validated on real hardware
• Performance feedback is used to evolve better variants over many iterations
• The system scales evaluation across large fleets and multiple accelerator types
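To make the loop concrete, here is a minimal sketch of what such a generate → evaluate → select cycle might look like. All of the helpers (`llm.generate`, `llm.mutate`, `evaluate_on_hardware`) are illustrative placeholders of our own, not the paper's actual API:

    # Illustrative sketch of an LLM-driven kernel evolution loop.
    # The helper objects are hypothetical stand-ins for the
    # generate -> compile/benchmark/validate -> select cycle described above.
    import random

    def evolve_kernel(task_spec, llm, evaluate_on_hardware,
                      population_size=8, generations=20):
        # Seed the population with independent LLM-generated candidates.
        population = [llm.generate(task_spec) for _ in range(population_size)]

        best_src, best_latency = None, float("inf")
        for _ in range(generations):
            scored = []
            for src in population:
                # Compile, validate against a reference, and time on real hardware.
                ok, latency_ms = evaluate_on_hardware(src, task_spec)
                if ok:
                    scored.append((latency_ms, src))
                    if latency_ms < best_latency:
                        best_src, best_latency = src, latency_ms

            if not scored:
                # Every candidate failed to compile or validate; restart fresh.
                population = [llm.generate(task_spec) for _ in range(population_size)]
                continue

            # Keep the fastest candidates and ask the LLM to mutate them,
            # feeding the measured latency back into the prompt.
            scored.sort()
            parents = [src for _, src in scored[: population_size // 2]]
            population = [
                llm.mutate(random.choice(parents),
                           feedback=f"best latency so far: {scored[0][0]:.3f} ms")
                for _ in range(population_size)
            ]

        return best_src, best_latency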
Unlike one-shot code generation, KernelEvolve continuously improves kernels using closed-loop, hardware-in-the-loop feedback, and can discover non-obvious optimizations that rival or exceed expert-written code.
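For context, the hardware-in-the-loop step for a single candidate typically amounts to: compile the kernel, check its output against a reference implementation, and time it on the device. The sketch below uses real Triton/PyTorch APIs (`triton.testing.do_bench`, `torch.allclose`) but is our own illustration, not code from the paper; `candidate_add` stands in for an LLM-generated kernel.

    # Validating and benchmarking one candidate Triton kernel against a
    # PyTorch reference; the measured latency becomes the fitness signal.
    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def candidate_add(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
        pid = tl.program_id(0)
        offs = pid * BLOCK + tl.arange(0, BLOCK)
        mask = offs < n
        x = tl.load(x_ptr + offs, mask=mask)
        y = tl.load(y_ptr + offs, mask=mask)
        tl.store(out_ptr + offs, x + y, mask=mask)

    def evaluate(n=1 << 20, block=1024):
        x = torch.randn(n, device="cuda")
        y = torch.randn(n, device="cuda")
        out = torch.empty_like(x)
        grid = (triton.cdiv(n, block),)

        def run():
            candidate_add[grid](x, y, out, n, BLOCK=block)

        run()
        # Correctness gate: reject candidates that don't match the reference.
        correct = torch.allclose(out, x + y, rtol=1e-5, atol=1e-5)
        # Fitness signal: runtime in milliseconds measured on the actual device.
        latency_ms = triton.testing.do_bench(run)
        return correct, latency_ms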
In the paper we describe:
• The agent architecture and search space design
• How we scale kernel evaluation efficiently across heterogeneous accelerators
• Case studies showing performance gains over hand-tuned baselines
• Practical lessons from deploying this system in production ML workloads
Paper (arXiv): https://arxiv.org/abs/2512.23236 (66 pages)
LinkedIn: https://www.linkedin.com/posts/gangliao_excited-to-share-our-recent-work-on-kernelevolve-activity-7411781675740897280-AQth?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAzsrfsBRed-BvPAGqq9FgvVZ-v6F-sG4SM
We’d love feedback from folks working on compilers, kernels, ML systems, or agentic approaches to code generation.