We’ve been benchmarking a few models on our API platform and got some interesting performance numbers:
- MiniMax M2.5 → 0.118 s TTFT, 103 tokens/s
- GLM 5.1 → 120 tokens/s throughput
- Kimi K2.5 → 0.643 s TTFT, 69 tokens/s
- All models → ~99.9% request success rate
The latency difference is especially noticeable: a ~0.1 s TTFT feels almost instant in interactive apps.
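For anyone wanting to reproduce numbers like these, here's a minimal sketch of how TTFT and decode throughput can be measured from any streaming response. The `fake_stream` generator is a hypothetical stand-in for a real SDK's token iterator; in practice you'd pass the streamed chunks from your provider's client instead.

```python
import time

def measure_stream(token_iter):
    """Consume a stream of tokens; return (ttft_seconds, decode_tokens_per_sec).

    TTFT = time from request start until the first token arrives.
    Decode throughput = tokens after the first, divided by the time
    spent generating them (i.e. TTFT is excluded).
    """
    start = time.perf_counter()
    first_at = None
    count = 0
    for _ in token_iter:
        now = time.perf_counter()
        if first_at is None:
            first_at = now
        count += 1
    end = time.perf_counter()

    if first_at is None:          # empty stream: no tokens arrived
        return float("nan"), 0.0
    ttft = first_at - start
    decode_time = end - first_at
    tps = (count - 1) / decode_time if decode_time > 0 and count > 1 else 0.0
    return ttft, tps

def fake_stream(n_tokens=50, first_delay=0.05, per_token_delay=0.002):
    """Hypothetical stand-in for an API's streaming iterator."""
    time.sleep(first_delay)       # simulated queueing + prefill
    for _ in range(n_tokens):
        yield "tok"
        time.sleep(per_token_delay)

ttft, tps = measure_stream(fake_stream())
print(f"TTFT: {ttft:.3f}s, throughput: {tps:.0f} tokens/s")
```

Averaging over many requests (and separating warm vs. cold runs) matters a lot here, since a single sample can be skewed by queueing on the provider's side.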
Let me know how you're evaluating LLM APIs. Are you optimizing more for latency, throughput, or cost?