frontpage.

Show HN: AI-optimized x86-64 assembly vs. GCC -O3 on three production kernels

https://github.com/cleonard2341/ai-kernel-optimizer/blob/main/blog/ai-assembly-vs-gcc-o3.md
1•cod-e•1h ago
Show HN: AI-generated assembly vs GCC -O3 on real codebases (300K fuzz, 0 failures)

Three kernels extracted from real open source projects, optimized with AI-generated x86-64 assembly, verified with 100K differential fuzz each:

Kernel | AI strategy | Speedup | Verdict
Base64 decode | SSSE3 pshufb table-free lookup | 4.8–6.3x | AI wins
LZ4 fast decode | SSE 16-byte match copy | ~1.05x | AI wins (marginal)
Redis SipHash | Reordered SIPROUND scheduling | 0.97x | GCC wins

The base64 win: GCC can't auto-vectorize a 256-byte lookup table (it's a gather pattern). The AI replaces it with a pshufb nibble trick: 16 parallel lookups in one instruction, zero table accesses. 1.8 GB/s → 11.6 GB/s.

The SipHash loss: on pure ALU kernels (adds, rotates, XORs), GCC's scheduler is already near-optimal.

300K total fuzz iterations, zero mismatches. Every result is one command to reproduce.
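For readers who have not seen the nibble trick, here is a scalar C model of the general idea, not the repo's actual assembly. It assumes the standard RFC 4648 alphabet and input that has already been validated. The names `OFFSET_BY_NIBBLE`, `b64_value`, and `b64_values16` are made up for this sketch. The 16-entry table is the part that would fit in one register for pshufb, and the loop over 16 lanes stands in for the single SIMD instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset that turns an ASCII byte into its 6-bit base64 value, indexed by
 * the byte's high nibble. '+' and '/' share high nibble 2, so '/' is
 * remapped to the otherwise unused index 1. Hypothetical table for this
 * sketch; entries for nibbles that never occur in valid input are 0. */
static const int8_t OFFSET_BY_NIBBLE[16] = {
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0
};

/* Map one valid base64 character to its value without a 256-byte table.
 * The (c == '/') term mirrors the compare mask a SIMD version would use. */
static uint8_t b64_value(uint8_t c) {
    uint8_t idx = (uint8_t)((c >> 4) - (c == '/'));
    return (uint8_t)(c + OFFSET_BY_NIBBLE[idx]);
}

/* Scalar stand-in for one 16-lane step: every lane does the same
 * nibble-indexed lookup, which is the shape pshufb handles in parallel. */
static void b64_values16(const uint8_t in[16], uint8_t out[16]) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < 16; i++) {
        out[i] = b64_value(in[i]);
    }
}
```

Calling `b64_values16` on the 16 bytes of "ABCDEFGHIJKLMNOP" fills `out` with 0 through 15. A real kernel would still need to reject invalid characters before or alongside this step, which this sketch leaves out.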

Comments

cod-e•1h ago
Author here. Some context on how this works and what it doesn't do.

The system doesn't replace the compiler. It sits on top of it. The key insight (which took a few failed experiments to learn) is that AI-generated assembly is dangerous for code with error handling, state, and control flow, but strong on pure computational kernels.

We tried having the AI rewrite an entire packet parser. It shipped two bugs (flag clobbering, unsigned underflow) and was 1.23x slower than GCC. Then we split the architecture: compiler owns all structural code (validation, error paths, bounds checks, state management), AI only optimizes the inner kernel after all checks pass. Same parser, zero bugs on first try, clean performance win.

That's the design principle behind everything here. The compiler guarantees correctness by construction. The AI only touches pure load/transform/store kernels with no branches. Then we verify with 100K differential fuzz: run random inputs through both versions, compare output byte-by-byte.

What the AI is good at: spotting SIMD opportunities GCC misses. The base64 case is textbook. GCC sees a 256-byte lookup table and generates scalar loads. The AI recognizes that base64's alphabet can be decomposed into nibble ranges and uses pshufb to do 16 parallel lookups. That's not a novel technique (simdjson and others use it), but the point is the AI found and applied it automatically.

What the AI is bad at: pure ALU scheduling. SipHash is adds, rotates, and XORs with tight data dependencies. GCC's instruction scheduler already does this near-optimally. The AI tried and lost. The system reports that honestly.

The verification reports and build scripts are in the repo; every number is one shell command to reproduce. Happy to answer questions about the architecture, the failure cases, or where this goes next.
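A rough sketch of what the differential fuzz step described above could look like in plain C, assuming kernels with an in/out/length signature. Everything here is hypothetical and not taken from the repo: the `kernel_fn` type, the toy XOR kernels, the xorshift generator, and `diff_fuzz` itself. The wide kernel stands in for a hand-tuned version, and the broken one shows the kind of tail-handling bug the comparison is meant to catch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LEN 256

typedef void (*kernel_fn)(const uint8_t *in, uint8_t *out, size_t len);

/* Reference kernel: the plain scalar loop. */
static void xor_ref(const uint8_t *in, uint8_t *out, size_t len) {
    for (size_t i = 0; i < len; i++) out[i] = in[i] ^ 0x5A;
}

/* Candidate kernel: 8 bytes per step plus a scalar tail. */
static void xor_wide(const uint8_t *in, uint8_t *out, size_t len) {
    size_t i = 0;
    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, in + i, 8);
        w ^= UINT64_C(0x5A5A5A5A5A5A5A5A);
        memcpy(out + i, &w, 8);
    }
    for (; i < len; i++) out[i] = in[i] ^ 0x5A;
}

/* Deliberately broken candidate: forgets the tail bytes. */
static void xor_broken(const uint8_t *in, uint8_t *out, size_t len) {
    for (size_t i = 0; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, in + i, 8);
        w ^= UINT64_C(0x5A5A5A5A5A5A5A5A);
        memcpy(out + i, &w, 8);
    }
}

/* Small deterministic PRNG (xorshift64) so a failing run can be replayed. */
static uint64_t next_rand(uint64_t *state) {
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Run both kernels on the same random inputs and count the iterations
 * whose output buffers differ. Whole buffers are compared, so stray
 * writes past len (within MAX_LEN) are caught too. */
static size_t diff_fuzz(kernel_fn ref, kernel_fn cand, size_t iters, uint64_t seed) {
    uint8_t in[MAX_LEN], want[MAX_LEN], got[MAX_LEN];
    uint64_t state = seed ? seed : UINT64_C(0x9E3779B97F4A7C15);
    size_t mismatches = 0;
    assert(ref != NULL && cand != NULL);
    for (size_t n = 0; n < iters; n++) {
        size_t len = (size_t)(next_rand(&state) % (MAX_LEN + 1));
        for (size_t i = 0; i < len; i++) {
            in[i] = (uint8_t)(next_rand(&state) >> 32);
        }
        memset(want, 0, sizeof want);
        memset(got, 0, sizeof got);
        ref(in, want, len);
        cand(in, got, len);
        if (memcmp(want, got, sizeof want) != 0) mismatches++;
    }
    return mismatches;
}
```

With these toy kernels, `diff_fuzz(xor_ref, xor_wide, 100000, 1)` should report zero mismatches, while swapping in `xor_broken` should report a nonzero count. A fixed seed keeps any failure reproducible, which fits the one-command-to-reproduce goal.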

This does a few things: it tells the packet parser failure story before anyone asks "but what about real code," it explains the architecture, it credits existing work (simdjson) so nobody accuses you of claiming to invent pshufb tricks, and it ends with an invitation that keeps you in the thread. The honest failure story in paragraph two will do more for your credibility than any benchmark.

Michael Abrash doubled Quake framerate

https://fabiensanglard.net/quake_asm_optimizations/index.html
1•chunkles•4m ago•0 comments

Alexei Navalny Was Murdered

1•eimrine•10m ago•0 comments

Show HN: Tufte Editor – Local Markdown Editor with Tufte CSS Live Preview

https://github.com/onedeeper/tufteeditor
1•avngr86•13m ago•0 comments

No Coding Before 10am

https://michaelxbloch.substack.com/p/no-coding-before-10am
1•imartin2k•14m ago•0 comments

The Medal Comes After the Meme

https://mikaelpawlo.substack.com/p/the-medal-comes-after-the-meme
1•imartin2k•15m ago•0 comments

The Demise of Conflict Studies

https://dissentmagazine.org/article/the-demise-of-conflict-studies/
1•hackandthink•16m ago•0 comments

What the hell is Forth? (2019)

https://blog.information-superhighway.net/what-the-hell-is-forth
2•tosh•17m ago•0 comments

Oat – Ultra-lightweight, semantic, zero-dependency HTML UI component library

https://oat.ink/
2•twapi•18m ago•0 comments

Claude Code Tips from the Guy Who Built It

https://www.anup.io/35-claude-code-tips-from-the-guy-who-built-it/
2•todsacerdoti•21m ago•0 comments

I Turned an ESP32 into a Thermal USB Webcam

https://www.youtube.com/watch?v=jyhVxC0ipE8
1•iamflimflam1•25m ago•0 comments

ByteDance Seed 2.0

https://seed.bytedance.com/en/seed2
1•tosh•25m ago•0 comments

Gemini's mobile app inherits Google's location permissions

https://support.google.com/gemini/answer/14554984?hl=en&co=GENIE.Platform%3DAndroid
1•leogout•29m ago•0 comments

Solve Everything

https://solveeverything.org/
1•o4c•33m ago•0 comments

Jailbreaking Google Translate

https://twitter.com/elder_plinius/status/2020933759533465658
1•helsinkiandrew•35m ago•0 comments

Show HN: GPACalc – Free GPA and CGPA Calculator (4.0/5.0/10.0 scales)

https://gpacalc.app/
1•YidaDev•38m ago•1 comments

Project Oberon: A Late Appraisal (2025)

https://www.youtube.com/watch?v=hZyNFaojbew
1•pjmlp•39m ago•0 comments

Marching Morons; a Year in Books; AI Character Names

https://bernoff.com/blog/marching-morons-a-year-in-books-ai-character-names-newsletter-4-february...
1•jruohonen•39m ago•0 comments

AI Shifts Concern from Technical Debt to Cognitive Debt

https://margaretstorey.com/blog/2026/02/09/cognitive-debt/
3•reasonableklout•39m ago•0 comments

Need Help, the Softraid and Lvm

1•areslee•40m ago•0 comments

Engineers are becoming sorcerers – Future of software dev with OpenAI Sherwin Wu

https://www.lennysnewsletter.com/p/engineers-are-becoming-sorcerers
1•rocho•42m ago•0 comments

Show HN: Ktrack – A simple, offline workout tracker

https://play.google.com/store/apps/details?id=com.companyname.ktrack&hl=en
1•KhashayarCodes•43m ago•0 comments

Are productivity gains due to AI hard-sell where you work?

1•newsicanuse•49m ago•0 comments

Show HN: LanceCalc – Open-source freelance platform fee calculator

https://github.com/asmahdi08/LanceCalc
1•ASMahdi•54m ago•0 comments

Show HN: Agent Lens – Code assistant observability in VSCode

https://github.com/23min/agent-lens
2•pjettter•1h ago•0 comments

Apple Rankings by the Appleist Brian Frange

https://applerankings.com/
3•Rant423•1h ago•1 comments

Saving the SpaceOrb360 with open source hardware and software (2024) [video]

https://www.youtube.com/watch?v=5K_E0J65uUg
3•starkparker•1h ago•0 comments

There's a Reason American Kids Are Such Picky Eaters

https://www.nytimes.com/2026/02/15/opinion/junk-food-picky-eaters.html
5•metadat•1h ago•1 comments

Watching Code Fly By

https://www.natemeyvis.com/on-watching-code-fly-by/
2•ingve•1h ago•0 comments

Show HN: DepGuard – Local dependency audit and license compliance (10 pkg mgrs)

https://github.com/suhteevah/depguard
2•suhteevah•1h ago•0 comments

Hamming Distance for Hybrid Search in SQLite

https://notnotp.com/notes/hamming-distance-for-hybrid-search-in-sqlite/
3•enz•1h ago•0 comments