frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Show HN: Serve 100 Large AI models on a single GPU with low impact to TTFT

https://github.com/leoheuler/flashtensors
7•leonheuler•3mo ago
I wanted to build an inference provider for proprietary AI models, but I did not have a huge GPU farm. I started experimenting with Serverless AI inference, but found out that coldstarts were huge. I went deep into the research and put together an engine that loads large models from SSD to VRAM up to ten times faster than alternatives. It works with vLLM, and transformers, and more coming soon.

With this project you can hot-swap entire large models (32B) on demand.

Its great for:

Serverless AI Inference

Robotics

On Prem deployments

Local Agents

And Its open source.

Let me know if anyone wants to contribute :)

Comments

billconan•3mo ago
can you hot swap a portion of an ai model, if my gpu is not large enough to hold the entire model? so that I can run half model first and load the other half.

Was going to share my work

1•hiddenarchitect•3m ago•0 comments

Pitchfork: A devilishly good process manager for developers

https://pitchfork.jdx.dev/
1•ahamez•3m ago•0 comments

You Are Here

https://brooker.co.za/blog/2026/02/07/you-are-here.html
2•mltvc•8m ago•0 comments

Why social apps need to become proactive, not reactive

https://www.heyflare.app/blog/from-reactive-to-proactive-how-ai-agents-will-reshape-social-apps
1•JoanMDuarte•8m ago•1 comments

How patient are AI scrapers, anyway? – Random Thoughts

https://lars.ingebrigtsen.no/2026/02/07/how-patient-are-ai-scrapers-anyway/
1•samtrack2019•9m ago•0 comments

Vouch: A contributor trust management system

https://github.com/mitchellh/vouch
1•SchwKatze•9m ago•0 comments

I built a terminal monitoring app and custom firmware for a clock with Claude

https://duggan.ie/posts/i-built-a-terminal-monitoring-app-and-custom-firmware-for-a-desktop-clock...
1•duggan•10m ago•0 comments

Tiny C Compiler

https://bellard.org/tcc/
1•guerrilla•11m ago•0 comments

Y Combinator Founder Organizes 'March for Billionaires'

https://mlq.ai/news/ai-startup-founder-organizes-march-for-billionaires-protest-against-californi...
1•hidden80•12m ago•1 comments

Ask HN: Need feedback on the idea I'm working on

1•Yogender78•12m ago•0 comments

OpenClaw Addresses Security Risks

https://thebiggish.com/news/openclaw-s-security-flaws-expose-enterprise-risk-22-of-deployments-un...
1•vedantnair•13m ago•0 comments

Apple finalizes Gemini / Siri deal

https://www.engadget.com/ai/apple-reportedly-plans-to-reveal-its-gemini-powered-siri-in-february-...
1•vedantnair•13m ago•0 comments

Italy Railways Sabotaged

https://www.bbc.co.uk/news/articles/czr4rx04xjpo
3•vedantnair•13m ago•0 comments

Emacs-tramp-RPC: high-performance TRAMP back end using MsgPack-RPC

https://github.com/ArthurHeymans/emacs-tramp-rpc
1•fanf2•15m ago•0 comments

Nintendo Wii Themed Portfolio

https://akiraux.vercel.app/
1•s4074433•19m ago•1 comments

"There must be something like the opposite of suicide "

https://post.substack.com/p/there-must-be-something-like-the
1•rbanffy•21m ago•0 comments

Ask HN: Why doesn't Netflix add a “Theater Mode” that recreates the worst parts?

2•amichail•22m ago•0 comments

Show HN: Engineering Perception with Combinatorial Memetics

1•alan_sass•28m ago•2 comments

Show HN: Steam Daily – A Wordle-like daily puzzle game for Steam fans

https://steamdaily.xyz
1•itshellboy•30m ago•0 comments

The Anthropic Hive Mind

https://steve-yegge.medium.com/the-anthropic-hive-mind-d01f768f3d7b
1•spenvo•30m ago•0 comments

Just Started Using AmpCode

https://intelligenttools.co/blog/ampcode-multi-agent-production
1•BojanTomic•32m ago•0 comments

LLM as an Engineer vs. a Founder?

1•dm03514•32m ago•0 comments

Crosstalk inside cells helps pathogens evade drugs, study finds

https://phys.org/news/2026-01-crosstalk-cells-pathogens-evade-drugs.html
2•PaulHoule•34m ago•0 comments

Show HN: Design system generator (mood to CSS in <1 second)

https://huesly.app
1•egeuysall•34m ago•1 comments

Show HN: 26/02/26 – 5 songs in a day

https://playingwith.variousbits.net/saturday
1•dmje•34m ago•0 comments

Toroidal Logit Bias – Reduce LLM hallucinations 40% with no fine-tuning

https://github.com/Paraxiom/topological-coherence
1•slye514•37m ago•1 comments

Top AI models fail at >96% of tasks

https://www.zdnet.com/article/ai-failed-test-on-remote-freelance-jobs/
5•codexon•37m ago•2 comments

The Science of the Perfect Second (2023)

https://harpers.org/archive/2023/04/the-science-of-the-perfect-second/
1•NaOH•38m ago•0 comments

Bob Beck (OpenBSD) on why vi should stay vi (2006)

https://marc.info/?l=openbsd-misc&m=115820462402673&w=2
2•birdculture•42m ago•0 comments

Show HN: a glimpse into the future of eye tracking for multi-agent use

https://github.com/dchrty/glimpsh
1•dochrty•42m ago•0 comments