Faster inference won't save you

https://graphcoder.ai/blog/faster-inference-wont-save-you

3•ramstar3000•1h ago

Comments

shreyash3087•1h ago

The latency table says it all. Over 20 turns, cloud-to-cloud adds 40 ms of network latency in total; hotel Wi-Fi adds 16 seconds. You can halve inference time and still have a broken product on bad connections.
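
A rough sketch of that arithmetic, assuming each turn costs exactly one network round trip and a hypothetical 1 s of inference per turn (neither assumption comes from the article; only the 20 turns, 40 ms, and 16 s figures do):

```python
TURNS = 20

# Per-turn round-trip times back-derived from the quoted totals.
# Assumption: one network round trip per turn.
CLOUD_RTT_S = 0.040 / TURNS   # 40 ms total over 20 turns
HOTEL_RTT_S = 16.0 / TURNS    # 16 s total over 20 turns


def session_time_s(turns: int, round_trip_s: float, inference_s: float) -> float:
    """Wall-clock time for a session: each turn pays one round trip plus inference."""
    return turns * (round_trip_s + inference_s)


# Hypothetical inference cost per turn, for illustration only.
INFERENCE_S = 1.0

for name, rtt in [("cloud-to-cloud", CLOUD_RTT_S), ("hotel Wi-Fi", HOTEL_RTT_S)]:
    before = session_time_s(TURNS, rtt, INFERENCE_S)
    after = session_time_s(TURNS, rtt, INFERENCE_S / 2)
    print(f"{name}: {before:.2f}s -> {after:.2f}s with inference halved")
```

Under these assumed numbers, halving inference still leaves the full 16 s of network time on hotel Wi-Fi; the only way to remove it is to make fewer round trips.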

Var1377•14m ago

is this an LLM?

Var1377•13m ago

does this mean you can disconnect from the internet entirely with the agent loop still running?

ramstar3000•7m ago

yes this is central to our thesis :)