Hi everyone! This weekend I shipped a quant for the Flash-Base model in the DeepSeek V4 series. I posted all the quality, throughput, and verification metrics in the repo.
It packs the full 284B params into 157 GiB and runs at full FP8 speed. I ran most of my tests on 4 H100s with about 320 GB of VRAM.
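As a rough sanity check on those sizes, here is a minimal sketch of the arithmetic. It assumes the 80 GB H100 variant (which is what 4 cards at about 320 GB implies) and treats the 157 GiB as covering all 284B params; the variable names are just for illustration.

```python
# Back-of-the-envelope check of the quant size, using only figures from the post.
GIB = 2**30                     # bytes in one GiB

params = 284e9                  # 284B parameters
size_gib = 157                  # quantized checkpoint size in GiB

# Average storage per parameter, including any scales/metadata in the file
bits_per_param = size_gib * GIB * 8 / params

# Assumption: 80 GB H100s, so 4 cards give 320 GB total
total_vram_gb = 4 * 80
size_gb = size_gib * GIB / 1e9  # same size expressed in decimal GB
headroom_gb = total_vram_gb - size_gb

print(f"{bits_per_param:.2f} bits per parameter on average")
print(f"{size_gb:.1f} GB of weights, {headroom_gb:.1f} GB left for KV cache and activations")
```

The average comes out a bit above 4 bits per param, which is what you would expect if an INT4 checkpoint also carries scales or keeps some tensors at higher precision.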
mandeepj•1h ago
Would you mind sharing your bill?
saivegasena•1h ago
I built it autonomously using my company's AI research agents, so it was technically free for me. In total it took 80 experiments and 49 hrs. I checked rental rates, which were $6.771 an hour, so ~$350, which seemed pretty worth it imo.
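For anyone who wants to redo the math, a quick sketch, assuming the $6.771/hr rate covers the whole setup and that all 49 hrs were billed at that rate:

```python
# Rough rental-cost estimate from the figures quoted above.
hours = 49               # total run time in hours
rate_per_hour = 6.771    # quoted rental rate in USD per hour (assumed to cover the whole setup)
experiments = 80         # number of experiments run

total_cost = hours * rate_per_hour
cost_per_experiment = total_cost / experiments

print(f"total: ${total_cost:.2f}")
print(f"per experiment: ${cost_per_experiment:.2f}")
```

That works out to about $332, in the same ballpark as the ~$350 estimate, or a little over $4 per experiment.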
saivegasena•1h ago
https://huggingface.co/EnsueAI/DeepSeek-V4-Flash-Base-INT4
lmk what you think!