3-8 tokens/s seems pretty sluggish for interactive coding. I guess it works for background agents, but with a human in the loop that latency is tough.
Also hard to justify the $50k capex compared to just hitting the Anthropic API. You'd need massive volume to break even on that hardware, especially with electricity costs. Seems like overoptimization unless you have strict data privacy needs.
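Rough back-of-the-envelope sketch of that break-even, in case it helps. Only the $50k and the 3-8 tokens/s come from the thread; the API price, power draw, and electricity rate below are made-up placeholders, so plug in your own numbers. It also assumes the box runs flat out all year, which is the best case for the hardware.

```python
HOURS_PER_YEAR = 365 * 24
SECONDS_PER_YEAR = HOURS_PER_YEAR * 3600


def breakeven_years(capex_usd, tokens_per_s, api_usd_per_mtok,
                    power_kw, usd_per_kwh):
    """Years until avoided API spend pays back the hardware.

    Assumes the machine generates tokens and draws power_kw
    continuously, and ignores depreciation, maintenance, and
    prompt (input) token pricing.
    """
    tokens_per_year = tokens_per_s * SECONDS_PER_YEAR
    api_cost_avoided = tokens_per_year / 1e6 * api_usd_per_mtok
    electricity_cost = power_kw * HOURS_PER_YEAR * usd_per_kwh
    net_savings = api_cost_avoided - electricity_cost
    if net_savings <= 0:
        return float("inf")  # never pays back under these inputs
    return capex_usd / net_savings


# Hypothetical inputs: $15 per million tokens, 1 kW draw, $0.15/kWh.
for tps in (3, 8):
    years = breakeven_years(50_000, tps, 15.0, 1.0, 0.15)
    print(f"{tps} tokens/s -> break-even in about {years:.0f} years")
```

With placeholder numbers like these, even the 8 tokens/s case takes decades to pay back, which is the point: at that throughput a single machine just doesn't generate enough tokens to offset $50k.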
storystarling•10m ago