I'm a senior Rails dev who started self-hosting AI models on my own GPU (7800X3D + 4070 Ti Super) last year. The key insight was building a shared GpuJob base class (~130 lines) that handles Redis-based GPU locking, retries, and health tracking — every tool inherits from it, so adding a new one is about 60-80 lines of tool-specific code. The agent workflow (Hermes running in Discord) is where it gets wild — I voice-message an idea and it builds the feature, commits, pushes a PR, and deploys on merge. Happy to answer questions about the architecture, the self-hosting setup, or the agent workflow.
frogr•2h ago