
Kenosis – Quantize ONNX models without breaking kernel fusion

https://github.com/CoreEpoch/kenosis

2•CoreEpoch•58m ago

Comments

CoreEpoch•57m ago

I was working on a camera inference project and tried to quantize my .onnx models with the ORT Python quantizer, only to find that the quantized models ran slower, which didn't make sense to me.

So I started writing my own quantization tool in Rust. As I peeled back the layers and ran tests, I figured out that the ORT quantizer places the Dequantize node between the Conv and the ReLU, which breaks kernel fusion. If you move the Dequantize after the activation instead, the rewrite is mathematically identical (max(0, x · scale) == max(0, x) · scale for scale > 0), but the runtime can now fuse the kernels, so you get a large speedup at the same accuracy. I've tested it in production environments as well. It doesn't work great on YOLO currently, but tbh I'm not pressed to get it tuned up for YOLO at the moment.
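To make the identity concrete, here is a minimal Python sketch of the scalar math only. It is not Kenosis code and not an ONNX graph: the function names, the example scale, and the integer "accumulator" values are all made up for illustration. It checks that applying ReLU before or after multiplying by a positive scale gives the same result, and that this stops being true for a negative scale.

```python
import random

def relu(v):
    # Plain ReLU: pass positive values through, clamp everything else to 0.
    return v if v > 0 else 0.0

def dequant_then_relu(acc, scale):
    # Order described for the ORT quantizer: Conv -> Dequantize -> ReLU.
    return relu(acc * scale)

def relu_then_dequant(acc, scale):
    # Reordered graph: Conv -> ReLU -> Dequantize.
    return relu(acc) * scale

random.seed(0)
scale = 0.0173  # hypothetical positive dequantization scale
accs = [random.randint(-2**15, 2**15) for _ in range(1000)]  # made-up integer conv outputs

# For scale > 0 the two orderings agree on every input.
assert all(dequant_then_relu(a, scale) == relu_then_dequant(a, scale) for a in accs)

# For scale < 0 they do not, which is why the condition matters.
assert dequant_then_relu(5, -1.0) != relu_then_dequant(5, -1.0)
```

This only demonstrates why the reordering is safe for ReLU with a positive scale. A real quantized graph also has to deal with zero points and other activation types, which this sketch ignores.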

Some quick benchmarks (i5-13420H CPU, single-threaded ORT):

SqueezeNet: 2.32x faster than FP32 (ORT quantizer made it slower than FP32)

ResNet50: 2.46x faster than FP32, 40% faster than ORT's quantizer

Zero Python dependencies, single native binary, self-calibrating. Apache-2.0.

Technical write-up: https://coreepoch.dev/research/kenosis-activation-aware-quan...

cargo install kenosis-cli
kenosis quantize model.onnx -o model_int8.onnx --static-int8

If there is an AI bubble, where is it?

https://gregoryap.substack.com/p/if-there-is-an-ai-bubble-where-is

1•gphil•47s ago•0 comments

Introducing Opus 4.8

https://old.reddit.com/r/ClaudeAI/comments/1tq99mu/introducing_claude_opus_48/

1•baroiall•1m ago•0 comments

Show HN: Secure or Broken? A puzzle game about the pain of configuring CSP

https://cspradar.com/tools/csp-defender

1•itsdevdaniel•2m ago•0 comments

How the EU's plan to turbocharge Italy's economy fell flat

https://www.ft.com/content/152088a1-9615-4485-9669-fb8300f60b13

1•malshe•3m ago•1 comments

Generative Recursive ReAsoning Models (Gram)

https://arxiv.org/abs/2605.19376

1•ijidak•3m ago•0 comments

Why American Parents Send Their Kids to 'Russian Math' (2017)

https://www.mathschool.com/blog/news-and-events/npr-why-thousands-of-american-parents-are-sending...

1•Tomte•5m ago•0 comments

Show HN: Tokenscope – see what your Claude Code session cost

https://github.com/wartzar-bee/tokenscope

1•wartzarbee•6m ago•0 comments

Bricks and Minifigs Stole a Man's $200k Lego Collection

https://mybricklog.com/blog/bricks-minifigs-corporate-stole-old-mans-200000-lego-collection

2•philips•8m ago•0 comments

To Gen or Not to Gen: The Ethical Use of Generative AI

https://blog.johanneslink.net/2025/11/04/to-gen-or-not-to-gen/

1•archagon•8m ago•1 comments

New Phishing Technique Vaultjacking: One Captured Pin, the Password Vault

https://phishu.net/blogs/blog-vaultjacking-phishing-the-google-password-manager-vault-in-the-phis...

1•curtbraz•8m ago•0 comments

Using Claude Code with GPT 5.5, Gemini 3.5, Grok 4.3, and other models

https://dechained.ai

2•sryDarioXOXO•10m ago•0 comments

The AI Resist List

https://airesistlist.org/

3•lylo•10m ago•0 comments

Announcing Rust 1.96

https://blog.rust-lang.org/2026/05/28/Rust-1.96.0/

4•adamch•12m ago•0 comments

Built a chat-first AI personal operator in 48h – need 5 honest beta testers

https://operatoros-web-czi4.onrender.com

2•tomdieter•12m ago•0 comments

A Double Shot of DuckDB

https://peterdohertys.website/blog-posts/double-shot-of-duck.html

2•ethagnawl•13m ago•0 comments

William Joseph "Wild Bill" Donovan, New York

https://gratitude250.substack.com/p/william-joseph-wild-bill-donovan

2•rbanffy•13m ago•0 comments

Death of Security by Obscurity

https://blog.reqproof.com/p/death-of-security-by-obscurity

3•LeonidBugaev•15m ago•0 comments

Nitpicking the shell history scene in 'Tron: Legacy'

https://www.chiark.greenend.org.uk/~sgtatham/quasiblog/tron-legacy/

6•speckx•16m ago•1 comments

Training more efficient smol models with SSTT and Canonical Entity IDs

https://github.com/beaglabs/sst/

1•jdbohrman•18m ago•0 comments

Separate the Cord from the Device

https://bookofjoe2.blogspot.com/2026/05/blog-post_27.html

3•bookofjoe•18m ago•0 comments

Folding in Parallel

https://okmij.org/ftp/Algorithms/map-monoid-reduce.html

1•thunderbong•18m ago•0 comments

US government prepares to print $250 note featuring Trump's face

https://www.bbc.com/news/articles/clypeyx6nemo

1•tartoran•18m ago•0 comments

Elon Musk boosted false USAID conspiracy theories to shut down global aid

https://www.nbcnews.com/politics/doge/elon-musk-boosted-false-usaid-conspiracy-theories-global-ai...

4•tastyface•19m ago•0 comments

Associative learning turns DEET from aversive to appetitive in Aedes aegypti

https://journals.biologists.com/jeb/article/229/10/jeb251935/371741/Associative-learning-switches...

1•croes•20m ago•0 comments

EU wants crisis powers to seize control of chip supplies

https://www.ft.com/content/9d7d6204-4fc7-4f1d-af05-473c3649efcd

2•merksittich•20m ago•0 comments

Create the Space: They're waiting to show up

https://opensourceonpurpose.substack.com/p/create-the-space

1•taubek•20m ago•0 comments

Nobody talks about the AI bubble anymore

1•xchip•21m ago•2 comments

gRPC Studio, open sourced web UI for managing gRPC

https://medium.com/@pranavpsawant/building-a-reflection-based-grpc-explorer-with-streaming-and-au...

1•pranavpsawant•22m ago•0 comments

Demo: Fold your coding sessions into LLM weights

https://app.scalarlmforge.com/blog/introducing-orbital

1•gdiamos•25m ago•0 comments

How Online Sleuthing Helped Catch the ‘Google Insider’ on Polymarket

https://www.wsj.com/finance/currencies/how-online-sleuthing-helped-catch-the-google-polymarket-tr...

3•thm•27m ago•1 comments