This is consistent with my own experiments with local LLMs. I've also been experimenting with using LangGraph to chain the local model to flagship LLMs when it can't achieve the outcome on its own. A minimal sketch of that fallback pattern is below.
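Here's roughly what I mean, assuming LangGraph's StateGraph API. The model calls (local_model, flagship_model) and the confident flag are hypothetical stand-ins for whatever local backend and confidence check you actually use:

```python
from typing import TypedDict

from langgraph.graph import StateGraph, END


class State(TypedDict):
    prompt: str
    answer: str
    confident: bool


def local_model(prompt: str) -> tuple[str, bool]:
    # Hypothetical stand-in for a real local-model call (llama.cpp,
    # Ollama, etc.); returns (answer, confidence heuristic).
    return "draft answer", False


def flagship_model(prompt: str) -> str:
    # Hypothetical stand-in for a hosted flagship-model call.
    return "flagship answer"


def call_local(state: State) -> dict:
    # Always try the cheap local model first.
    answer, confident = local_model(state["prompt"])
    return {"answer": answer, "confident": confident}


def call_flagship(state: State) -> dict:
    # Escalate only when the local model couldn't achieve the outcome.
    return {"answer": flagship_model(state["prompt"])}


def route(state: State) -> str:
    return "done" if state["confident"] else "escalate"


graph = StateGraph(State)
graph.add_node("local", call_local)
graph.add_node("flagship", call_flagship)
graph.set_entry_point("local")
graph.add_conditional_edges("local", route, {"done": END, "escalate": "flagship"})
graph.add_edge("flagship", END)

app = graph.compile()
result = app.invoke({"prompt": "Summarize this log", "answer": "", "confident": False})
print(result["answer"])  # falls through to the flagship answer here
```

The nice part is the escalation condition lives in one routing function, so you can tune how eagerly you fall back to the flagship model without touching either model call.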
It's good as a learning tool, or for applications where latency isn't a concern but request volume is high. But it's not yet fast enough to drive auto-complete in an IDE, for instance.