Ask HN: How to serve inference as we do with containers, with cached tokens
1•elesbao•2h ago
I've been reading and experimenting with vLLM, but it seems each day there are more and more articles and AI-generated long-form posts about every part of the stack. I have a few GPUs and work for a private education group. I want to run models internally and distribute access to a research team; I don't want one (or more) GPUs per user, nor do I want to train models. Currently I'm doing well with a local Qwen on my own single server, but I can't wrap my head around which part to tackle. Right now I'm looking at KV caches and building on top of vLLM, but I wanted something simple and secure that won't leak data from one session to another.
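For reference, my current single-server setup is roughly the stock vLLM OpenAI-compatible server (the model name here is illustrative, and flag names should be double-checked against your vLLM version's docs):

```shell
# Sketch of a single-server vLLM deployment (model name illustrative).
# --api-key adds shared-secret auth for the team;
# --enable-prefix-caching turns on automatic KV-cache prefix reuse
# across requests on this server.
vllm serve Qwen/Qwen2.5-7B-Instruct \
  --api-key "$VLLM_API_KEY" \
  --enable-prefix-caching
```

Researchers then hit it with any OpenAI-compatible client pointed at this host, which is the part I'd like to distribute without giving each user their own GPU.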