Idea: Inject a trainable diffusion attention module into each layer of a frozen AR Transformer. Both heads share one KV cache. The diffusion head drafts K=32 tokens in parallel; the AR head verifies them in a second pass and accepts the longest matching prefix. The output distribution is provably identical to the base model's.
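A minimal sketch of one draft-and-verify step as I read the summary, greedy case only; `model`, `draft_head`, `kv_cache`, and `mask_token_id` are hypothetical stand-ins, not the paper's actual API:

```python
import torch

K = 32  # tokens drafted per step, per the summary above

@torch.no_grad()
def generate_step(model, draft_head, input_ids, kv_cache, mask_token_id):
    # Pass 1: the diffusion head fills K masked positions in parallel,
    # attending to the shared KV cache of the already-accepted prefix.
    masks = torch.full((1, K), mask_token_id, dtype=torch.long,
                       device=input_ids.device)
    draft_logits = draft_head(masks, kv_cache=kv_cache)            # (1, K, vocab)
    draft_ids = draft_logits.argmax(dim=-1)                        # greedy draft

    # Pass 2: the frozen AR model scores the last accepted token plus the K
    # drafted tokens in one forward, writing into the same cache.
    verify_in = torch.cat([input_ids[:, -1:], draft_ids], dim=-1)  # (1, K+1)
    ar_logits = model(verify_in, kv_cache=kv_cache)                # (1, K+1, vocab)
    ar_ids = ar_logits.argmax(dim=-1)
    # ar_ids[:, i] predicts the token after verify_in[:, i], so drafted token i
    # is correct iff draft_ids[:, i] == ar_ids[:, i].

    # Accept the longest matching prefix, then take one bonus token from the
    # AR pass so every step emits at least one token.
    matches = (draft_ids == ar_ids[:, :K]).long()
    n_accept = int(matches.cumprod(dim=-1).sum())
    out = torch.cat([draft_ids[:, :n_accept],
                     ar_ids[:, n_accept:n_accept + 1]], dim=-1)

    # A real system would also roll the KV cache back past rejected positions
    # before the next step; that bookkeeping is omitted here.
    return out
```

The acceptance rule is what makes the claimed losslessness plausible under greedy decoding: every emitted token is either confirmed by, or directly produced by, the frozen AR model.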
Results:
- Up to 7.8x TPF (tokens per forward pass), ~6x wall-clock speedup on MATH-500.
- 16% of params trained, <1B tokens, 24h on 8xH200.
- vs. diffusion LMs (Dream, Fast-dLLM-v2, SDAR, Mercury, Gemini Diffusion): they modify base weights and lose accuracy (Fast-dLLM-v2: -11 pts on MATH-500). Orthrus freezes the backbone; accuracy matches Qwen3-8B exactly.
- vs. Speculative Decoding (EAGLE-3, DFlash): no external drafter, no separate cache, zero TTFT penalty (no drafter to init/sync). KV overhead is O(1) (~4.5 MiB flat). Acceptance length on MATH-500: 11.7 vs. 7.9 (DFlash) vs. 3.5 (EAGLE-3).
- Single-step denoising beats multi-step (6.35 vs. 3.53 TPF). KL distillation beats CE on acceptance rate (both objectives sketched after this list).
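Since the last bullet compares training objectives, here is a hedged sketch of what "KL distillation vs. CE" presumably means for the diffusion head; tensor names and shapes are my assumptions, not code from the paper:

```python
import torch.nn.functional as F

def kl_distill_loss(draft_logits, teacher_logits):
    # KL(teacher || student) per drafted position; the frozen AR model's
    # next-token distribution is the teacher and receives no gradient.
    log_q = F.log_softmax(draft_logits, dim=-1)          # (batch, K, vocab)
    p = F.softmax(teacher_logits.detach(), dim=-1)       # (batch, K, vocab)
    return F.kl_div(log_q, p, reduction="batchmean")

def ce_loss(draft_logits, target_ids):
    # Baseline: plain cross-entropy against the ground-truth next tokens.
    return F.cross_entropy(draft_logits.flatten(0, 1), target_ids.flatten())
```

Matching the full teacher distribution rather than one hard label would plausibly raise acceptance rate, since acceptance depends on agreeing with the AR head's own prediction, not with the data.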
Limitations: strictly bounded by the frozen base model (inherits its biases, hallucinations, knowledge gaps); Qwen3-only evaluation; greedy + rejection sampling only.
ilaksh•1h ago
Amazing. Is it possible to do this with Qwen 3.6 27B? Will it work with quants (I assume so)?
xiphias2•28m ago
The most interesting part of this idea for me is that it wasn't tried/implemented before, since it makes sense.
I haven't read the paper, but of course DTree tricks work here as well.