It includes:
1. Paged KV cache (block_size=1) + page-table semantics
2. Trie/radix prefix cache with reference-counted KV blocks (safe prefix reuse)
3. Attention metadata builder (page_table / cu_seqlens / positions / out_loc)
4. A simple KV-capacity-bounded scheduler (admission control + continue-batching)
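To make the list above concrete, here is a minimal sketch of item 1 (names like `BlockAllocator` and `Sequence` are illustrative, not the project's actual API). With block_size=1, one block holds exactly one token's K/V, and each sequence carries a page table mapping logical positions to physical slots; the reference counts are what later makes prefix reuse safe.

```python
from collections import deque

class BlockAllocator:
    """Free-list allocator over physical KV slots. With block_size=1,
    one block holds exactly one token's K/V vectors."""

    def __init__(self, num_blocks: int):
        self.free = deque(range(num_blocks))
        self.ref_count = [0] * num_blocks

    def alloc(self) -> int:
        block = self.free.popleft()   # IndexError here means KV capacity is exhausted
        self.ref_count[block] = 1
        return block

    def retain(self, block: int) -> None:
        self.ref_count[block] += 1    # another owner, e.g. a reused prefix

    def release(self, block: int) -> None:
        self.ref_count[block] -= 1
        if self.ref_count[block] == 0:
            self.free.append(block)   # freed only when no sequence references it


class Sequence:
    """One request. page_table maps logical position i to the
    physical KV slot page_table[i]."""

    def __init__(self, token_ids: list[int]):
        self.token_ids = list(token_ids)
        self.page_table: list[int] = []
```

Decoding one token then amounts to allocating one slot and appending it to the sequence's page table; the attention kernel indexes the KV cache through the table rather than through contiguous memory.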
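Item 2, the trie/radix prefix cache, could look roughly like this (a hypothetical shape, reusing the `BlockAllocator` sketch above): matching a new prompt walks the trie token by token, returning the physical blocks of the longest cached prefix and bumping their reference counts so they cannot be freed while the new sequence reads them.

```python
class TrieNode:
    __slots__ = ("children", "block")

    def __init__(self, block: int = -1):
        self.children: dict[int, "TrieNode"] = {}  # token id -> child node
        self.block = block                         # physical slot for this token's K/V


class PrefixCache:
    """Token-level trie: longest-prefix match over previously seen prompts."""

    def __init__(self, allocator: BlockAllocator):
        self.root = TrieNode()
        self.allocator = allocator

    def match_prefix(self, token_ids: list[int]) -> list[int]:
        """Return the shared KV blocks for the longest cached prefix,
        retaining each block so concurrent reuse is safe."""
        node, blocks = self.root, []
        for tok in token_ids:
            child = node.children.get(tok)
            if child is None:
                break
            self.allocator.retain(child.block)
            blocks.append(child.block)
            node = child
        return blocks

    def insert(self, token_ids: list[int], blocks: list[int]) -> None:
        """Record a finished prefill so future prompts can reuse it.
        (Eviction of cold nodes is omitted in this sketch.)"""
        node = self.root
        for tok, block in zip(token_ids, blocks):
            child = node.children.get(tok)
            if child is None:
                child = TrieNode(block)
                self.allocator.retain(block)  # the cache itself holds a reference
                node.children[tok] = child
            node = child
```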
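Item 3 flattens per-sequence state into the batched tensors a varlen attention kernel consumes. Below is a sketch of the prefill case, assuming PyTorch and the `Sequence` class above; the exact tensor conventions (padding value, dtypes, what `out_loc` points at) are my assumptions, not the project's. For decode, each sequence would contribute only its newest token.

```python
import torch

def build_attn_metadata(seqs: list[Sequence]) -> dict[str, torch.Tensor]:
    """Build the batched metadata tensors for a prefill step over `seqs`."""
    seq_lens = [len(s.page_table) for s in seqs]
    max_len = max(seq_lens)

    # page_table[i, j]: physical KV slot of sequence i's j-th token (-1 = padding).
    page_table = torch.full((len(seqs), max_len), -1, dtype=torch.int32)
    for i, s in enumerate(seqs):
        page_table[i, : seq_lens[i]] = torch.tensor(s.page_table, dtype=torch.int32)

    # cu_seqlens: exclusive prefix sum of lengths, e.g. [3, 5] -> [0, 3, 8];
    # sequence i's tokens occupy the flat range cu_seqlens[i]:cu_seqlens[i+1].
    cu = [0]
    for n in seq_lens:
        cu.append(cu[-1] + n)
    cu_seqlens = torch.tensor(cu, dtype=torch.int32)

    # positions: rotary/absolute position of every token in the flattened batch.
    positions = torch.cat([torch.arange(n) for n in seq_lens])

    # out_loc: slot each token's freshly computed K/V is scattered into
    # (with block_size=1 this is just the page-table entries, flattened).
    out_loc = torch.cat(
        [torch.tensor(s.page_table, dtype=torch.int32) for s in seqs]
    )

    return {"page_table": page_table, "cu_seqlens": cu_seqlens,
            "positions": positions, "out_loc": out_loc}
```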
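Finally, item 4: a deliberately simple scheduler loop in the same spirit, building on the classes above (again a hypothetical shape that glosses over the prefill/decode distinction and ignores prefix-cache hits during admission). Admission control means a waiting request only joins the batch if its whole prompt fits in free blocks; continue-batching means every running sequence gets one fresh slot per decode step.

```python
from collections import deque

class Scheduler:
    """KV-capacity-bounded, FCFS, learning-first: no priorities, no preemption."""

    def __init__(self, allocator: BlockAllocator):
        self.allocator = allocator
        self.waiting: deque[Sequence] = deque()  # requests not yet admitted
        self.running: list[Sequence] = []        # sequences holding KV blocks

    def add_request(self, seq: Sequence) -> None:
        self.waiting.append(seq)

    def step(self) -> list[Sequence]:
        # Admission control: admit FCFS while the full prompt fits in free
        # blocks (a prefix-cache hit would shrink this requirement).
        while (self.waiting
               and len(self.waiting[0].token_ids) <= len(self.allocator.free)):
            seq = self.waiting.popleft()
            seq.page_table = [self.allocator.alloc() for _ in seq.token_ids]
            self.running.append(seq)

        # Continue-batching: each running sequence needs one new slot for the
        # token it decodes this step; if capacity runs out we simply stall it
        # (a real engine would preempt and free its blocks instead).
        batch = []
        for seq in self.running:
            if self.allocator.free:
                seq.page_table.append(self.allocator.alloc())
                batch.append(seq)
        return batch
```

Finished sequences would release their blocks back through the allocator, or hand them to the prefix cache for reuse; that cleanup path is omitted here.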
It’s inspired by nano-vllm and mini-sglang, but not a direct copy — I re-implemented components step-by-step to understand how the pieces fit, with help from GPT-5.2. The scheduler policy is intentionally simple (learning-first).
Performance note: with 80,000 KV blocks allocated (at block_size=1, one block per token, so roughly 80K tokens of cache capacity), I measured ~1990 tokens/s on Llama 3.2 1B on a laptop RTX 4070.