We started with our own pain. We were running a generative AI startup and needed to run a Stable Diffusion pipeline with custom LoRAs. We found that running a custom model on a cloud GPU means either a steep fixed cost (with traditional cloud providers) or cold starts of several minutes (with serverless GPU providers).
We looked at successful non-GPU providers and came up with a hypothesis that still holds true today: we don't need to support custom Docker images; a single environment can run any model.
Of course, that alone did not solve cold starts. We had to work hard to optimize our platform to load and unload models as quickly as possible. We ended up building a pre-download mechanism and manipulating the page cache so that the predicted next model loads faster.
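To make that concrete, here is a minimal sketch of the pre-download plus page-cache-warming idea on Linux. The URL, path, and helper names are hypothetical, and our production mechanism is more involved:

    import os
    import urllib.request

    WEIGHTS_URL = "https://example.com/models/pipeline.safetensors"  # hypothetical
    LOCAL_PATH = "/var/cache/models/pipeline.safetensors"            # hypothetical

    def predownload(url: str, path: str) -> None:
        """Fetch the weights ahead of time so the cold start never hits the network."""
        if not os.path.exists(path):
            urllib.request.urlretrieve(url, path)

    def warm_page_cache(path: str) -> None:
        """Ask the kernel to read the whole file into the page cache."""
        fd = os.open(path, os.O_RDONLY)
        try:
            size = os.fstat(fd).st_size
            os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
        finally:
            os.close(fd)

    predownload(WEIGHTS_URL, LOCAL_PATH)
    warm_page_cache(LOCAL_PATH)

The idea is that by the time the inference process opens the weights, the reads are served from memory rather than disk.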
We wanted to make migration as easy as possible for our clients, and to learn as much as we could ourselves, so we started offering free assistance in adapting models. We learned that improving cold starts is not just about the platform; it also depends on how the model is loaded.
In this way, we helped several teams running LLMs and image generation models improve their user-facing ML features (shorter wait times) and often reduce costs.
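On the model-loading side, a typical suggestion looked something like the sketch below (illustrative only, not any specific client's code; the path is hypothetical): load safetensors weights memory-mapped and place them directly on the GPU, instead of unpickling a full checkpoint on the CPU and then copying it over.

    from safetensors.torch import load_file

    # With the file already warm in the page cache, this load is mostly a
    # straight copy into GPU memory rather than a deserialize-then-move step.
    state_dict = load_file("/var/cache/models/pipeline.safetensors", device="cuda:0")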
Try our platform here: https://dat1.co
We'd love to hear your thoughts on anything related to the subject.
Thanks, Arseny.
ayankovsky•21h ago
As to why we're better, I'd say a few reasons: lower cold starts, more transparent pricing, and a human-first approach where we work with you to make your model run as well as possible.