frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

Open in hackernews

LLM Inference Handbook

https://bentoml.com/llm/
150•djhu9•10h ago

Comments

sherlockxu•5h ago
Hi everyone. I'm one of the maintainers of this project. We're both excited and humbled to see it on Hacker News!

We created this handbook to make LLM inference concepts more accessible, especially for developers building real-world LLM applications. The goal is to pull together scattered knowledge into something clear, practical, and easy to build on.

We’re continuing to improve it, so feedback is very welcome!

GitHub repo: https://github.com/bentoml/llm-inference-in-production

armcat•3h ago
Amazing work on this, beautifully put together and very useful!
aligundogdu•4h ago
It's a really beautiful project, and I’d like to ask something purely out of curiosity and with the best intentions. What’s the name of the design trend you used for your website? I really loved the website too.
holografix•1h ago
Very good reference thanks for collating this!
subset•1h ago
Ooh this looks really neat! I'd love to see more content in the future on Structured outputs/Guided generation and sampling. Another great reference on inference-time algorithms for sampling is here: https://rentry.co/samplers
qrios•48m ago
Thanks for putting this together! From now on I only need one link to point interested ones to learn.

Only one suggestion: On page "OpenAI-compatible API" it would be great to have also a simple example for the pure REST call instead of the need to import the OpenAI package.

Bill Atkinson's Psychedelic User Interface

https://patternproject.substack.com/p/from-the-mac-to-the-mystical-bill
92•cainxinth•2h ago•32 comments

Recovering from AI Addiction

https://internetaddictsanonymous.org/internet-and-technology-addiction/signs-of-an-addiction-to-ai/
67•pera•1h ago•34 comments

AI Agent Benchmarks Are Broken

https://ddkang.substack.com/p/ai-agent-benchmarks-are-broken
15•neehao•20m ago•1 comments

At Least 13 People Died by Suicide Amid U.K. Post Office Scandal, Report Says

https://www.nytimes.com/2025/07/10/world/europe/uk-post-office-scandal-report.html
132•xbryanx•1h ago•80 comments

Show HN: Pangolin – Open source alternative to Cloudflare Tunnels

https://github.com/fosrl/pangolin
325•miloschwartz•15h ago•69 comments

OpenFront: Realtime Risk-like multiplayer game in the browser

https://openfront.io/
109•thombles•6h ago•33 comments

Postgres LISTEN/NOTIFY does not scale

https://www.recall.ai/blog/postgres-listen-notify-does-not-scale
479•davidgu•3d ago•217 comments

The day someone created 184 billion Bitcoin (2020)

https://decrypt.co/39750/184-billion-bitcoin-anonymous-creator
34•lawrenceyan•8h ago•37 comments

Apple vs the Law

https://formularsumo.co.uk/blog/2025/apple-vs-the-law/
268•tempodox•6h ago•228 comments

LLM Inference Handbook

https://bentoml.com/llm/
151•djhu9•10h ago•6 comments

Using Sound Waves to Put Out Fire: Story of Two George Mason University Students

https://wowparrot.com/using-sound-waves-to-put-out-fire/
10•taubek•2h ago•2 comments

'Click-to-cancel' rule is blocked

https://apnews.com/article/ftc-click-to-cancel-30db2be07fdcb8aefd0d4835abdb116a
21•gok•34m ago•12 comments

Batch Mode in the Gemini API: Process More for Less

https://developers.googleblog.com/en/scale-your-ai-workloads-batch-mode-gemini-api/
122•xnx•3d ago•43 comments

FP8 is ~100 tflops faster when the kernel name has "cutlass" in it

https://twitter.com/cis_female/status/1943069934332055912
117•limoce•2h ago•43 comments

The ChompSaw: A Benchtop Power Tool That's Safe for Kids to Use

https://www.core77.com/posts/137602/The-ChompSaw-A-Benchtop-Power-Tool-Thats-Safe-for-Kids-to-Use
211•surprisetalk•3d ago•130 comments

Show HN: Interactive pinout for the Raspberry Pi Pico 2

https://pico2.pinout.xyz
78•gadgetoid•3d ago•19 comments

What is Realtalk’s relationship to AI? (2024)

https://dynamicland.org/2024/FAQ/#What_is_Realtalks_relationship_to_AI
266•prathyvsh•22h ago•84 comments

Flix – A powerful effect-oriented programming language

https://flix.dev/
298•freilanzer•23h ago•147 comments

Btrfs Allocator Hints

https://lwn.net/ml/all/cover.1747070147.git.anand.jain@oracle.com/
27•forza_user•2d ago•7 comments

Show HN: Cactus – Ollama for Smartphones

https://github.com/cactus-compute/cactus
180•HenryNdubuaku•18h ago•67 comments

Series of posts on HTTP status codes (2018)

https://evertpot.com/http/
59•antonalekseev•2d ago•9 comments

FOKS: Federated Open Key Service

https://foks.pub/
254•ubj•1d ago•55 comments

Underwater turbine spinning for 6 years off Scotland's coast is a breakthrough

https://apnews.com/article/tidal-energy-turbine-marine-meygen-scotland-ffff3a7082205b33b612a1417e1ec6d6
213•djoldman•23h ago•187 comments

At Amazon's Biggest Data Center, Everything Is Supersized for A.I

https://www.nytimes.com/2025/06/24/technology/amazon-ai-data-centers.html
49•pseudolus•3h ago•33 comments

Graphical Linear Algebra

https://graphicallinearalgebra.net/
268•hyperbrainer•21h ago•21 comments

Red Hat Technical Writing Style Guide

https://stylepedia.net/style/
234•jumpocelot•22h ago•122 comments

Show HN: Open source alternative to Perplexity Comet

https://www.browseros.com/
248•felarof•19h ago•92 comments

Things I learned from 5 years at Vercel

https://leerob.com/vercel
3•gk1•25m ago•0 comments

Grok: Searching X for "From:Elonmusk (Israel or Palestine or Hamas or Gaza)"

https://simonwillison.net/2025/Jul/11/grok-musk/
486•simonw•13h ago•338 comments

Operational Apple-1 Computer for sale [video]

https://www.youtube.com/watch?v=XdBKuBhdZwg
58•guiambros•2d ago•27 comments