We have considered open-sourcing some of our optimized inference libraries in the future, but have not yet come to a decision on this.
Also, if you need a rough intuition as to why this is possible: this entire inference stack was built for exactly one model, so we can tune the whole framework accordingly.
We believe our improvements would hold on BF16, but let me check.
ntonozzi•1h ago
It is also a bit weird that they are not incorporating speculative decoding; that seems like a critical performance optimization, especially for decode-heavy workloads.
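For readers unfamiliar with the technique, here is a minimal sketch of why speculative decoding helps decode-heavy workloads: a cheap draft model proposes a chunk of `k` tokens, and the expensive target model verifies the chunk, accepting the longest prefix that matches its own greedy choices. The `target_next` and `draft_next` callables below are hypothetical stand-ins for real models, and the per-position verification calls would be a single batched forward pass in a real system.

```python
# Minimal sketch of greedy speculative decoding (toy, not any vendor's
# actual implementation). target_next / draft_next are hypothetical
# stand-ins: each maps a token prefix to the model's next greedy token.
from typing import Callable, List

def speculative_decode(
    target_next: Callable[[List[int]], int],  # expensive target model
    draft_next: Callable[[List[int]], int],   # cheap draft model
    prompt: List[int],
    max_new_tokens: int,
    k: int = 4,                               # tokens drafted per step
) -> List[int]:
    tokens = list(prompt)
    produced = 0
    while produced < max_new_tokens:
        # 1. Draft k tokens cheaply, one at a time.
        draft: List[int] = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2. Verify with the target model. In a real system all k
        #    positions are scored in one batched forward pass; here we
        #    call target_next per position for clarity.
        accepted = 0
        for i in range(k):
            expected = target_next(tokens + draft[:i])
            if draft[i] == expected:
                accepted += 1
            else:
                # First mismatch: keep the target's token and stop.
                draft = draft[:i] + [expected]
                accepted = i + 1
                break
        tokens.extend(draft[:accepted])
        produced += accepted
    return tokens[: len(prompt) + max_new_tokens]

if __name__ == "__main__":
    # Toy demo: both "models" emit an incrementing sequence, so every
    # draft chunk is accepted and decode advances k tokens per target step.
    nxt = lambda seq: (seq[-1] + 1) % 100
    print(speculative_decode(nxt, nxt, [0], max_new_tokens=10))
```

When the draft model agrees with the target often, each expensive verification step yields up to `k` tokens instead of one, which is why the technique matters most when decode dominates the workload.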
lukebechtel•1h ago