We started with our own pain. We were running a generative AI startup and needed to run a Stable Diffusion pipeline with custom LoRAs. We found that running a custom model on a cloud GPU means either a steep fixed cost (with traditional cloud providers) or cold starts of several minutes (with serverless GPU providers).
We looked at successful non-GPU providers and came up with a hypothesis that still holds true today: we don't need to support custom Docker images; a single environment can run any model.
Of course, that alone did not solve cold starts. We had to work hard to optimize our platform to load and unload models as quickly as possible. We ended up building a pre-download mechanism and manipulating the page cache so that the predicted next model loads faster.
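To make that concrete, here is a minimal sketch of the pre-download plus page-cache-warming idea on Linux. The URL, path, and helper names are hypothetical, and our production mechanism is more involved:

    import os
    import urllib.request

    WEIGHTS_URL = "https://example.com/models/pipeline.safetensors"  # hypothetical
    LOCAL_PATH = "/var/cache/models/pipeline.safetensors"            # hypothetical

    def predownload(url: str, path: str) -> None:
        """Fetch the weights ahead of time so the cold start never hits the network."""
        if not os.path.exists(path):
            urllib.request.urlretrieve(url, path)

    def warm_page_cache(path: str) -> None:
        """Ask the kernel to read the whole file into the page cache."""
        fd = os.open(path, os.O_RDONLY)
        try:
            size = os.fstat(fd).st_size
            os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
        finally:
            os.close(fd)

    predownload(WEIGHTS_URL, LOCAL_PATH)
    warm_page_cache(LOCAL_PATH)

The idea is that by the time the inference process opens the weights, the reads are served from memory rather than disk.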
We wanted to make migration as easy as possible for our clients, and to learn as much as we could ourselves, so we started offering free assistance in adapting models. We learned that improving cold starts is not just about the platform; it also depends on how the model is loaded.
In this way, we helped several teams running LLMs and image generation models improve their user-facing ML features (shorter wait times) and often reduce costs.
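On the model-loading side, a typical suggestion looked something like the sketch below (illustrative only, not any specific client's code; the path is hypothetical): load safetensors weights memory-mapped and place them directly on the GPU, instead of unpickling a full checkpoint on the CPU and then copying it over.

    from safetensors.torch import load_file

    # With the file already warm in the page cache, this load is mostly a
    # straight copy into GPU memory rather than a deserialize-then-move step.
    state_dict = load_file("/var/cache/models/pipeline.safetensors", device="cuda:0")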
Try our platform here: https://dat1.co
We'd love to hear your thoughts on anything related to the subject.
Thanks, Arseny.
ayankovsky•21h ago
As to why we're better, I'd say a few reasons: lower cold starts, more transparent pricing, and a human-first approach where we work with you to make your model run as well as possible.