That said, "worth it" still depends heavily on your hardware. A 4070 Ti gets you a very different answer than a 3060.
Disclosure: I'm building localllm-advisor.com, a free, client-side tool that helps answer this kind of question. It shows which models fit your GPU, with quantization options and estimated tok/s, or which GPU you'd need to run a specific model. It's relevant to the question so I'm mentioning it, but take it for what it is.
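For a rough sense of the arithmetic behind "does it fit, and how fast", here is a back-of-envelope sketch. It is not how the site above computes anything. The function names, the 1.5 GiB overhead allowance for KV cache and runtime buffers, and the idea that decode speed is capped by memory bandwidth divided by weight size are all my own simplifying assumptions, and real numbers will differ.

```python
# Back-of-envelope only: every name and default here is a hypothetical assumption.

def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in VRAM."""
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


def decode_tok_s_ceiling(params_billion: float, bits_per_weight: float,
                         bandwidth_gb_s: float) -> float:
    """Loose upper bound on decode tok/s, assuming every weight is read
    once per generated token and memory bandwidth is the only limit."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / weight_bytes


if __name__ == "__main__":
    # Example inputs are illustrative, not measurements of any specific card.
    print(fits(7, 4, vram_gib=12))    # 7B model at 4-bit on a 12 GiB card
    print(fits(13, 16, vram_gib=12))  # 13B model at 16-bit on the same card
    print(round(decode_tok_s_ceiling(7, 4, bandwidth_gb_s=360), 1))
```

The takeaway is only that quantization level and memory bandwidth move the answer a lot, which is why two cards can give very different results for the same model.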
politelemon•1h ago
ostefani•38m ago