
We built a serverless GPU inference platform with predictable latency

3•QubridAI•6h ago
We’ve been working on a GPU-first inference platform focused on predictable latency and cost control for production AI workloads.

Some of the engineering problems we ran into:

- GPU cold starts and queue scheduling
- Multi-tenant isolation without wasting VRAM
- Model loading vs container loading tradeoffs
- Batch vs real-time inference routing
- Handling burst workloads without long-term GPU reservation
- Cost predictability vs autoscaling behavior

We wrote up the architecture decisions, what failed, and what worked.

Happy to answer technical questions - especially around GPU scheduling, inference optimization, and workload isolation.

Comments

tgrowazay•6h ago
Well, do you have a blog post, or do we need to ask about each item to get it?