Optimising DeepSeek-R1-Distill-Qwen-7B for use in production
https://fin.ai/research/think-fast-reasoning-at-3ms-a-token/
1 • destraynor • 2h ago