In video games this tends to be a function of framerate, with higher FPS being more likely to produce coil whine as the GPU power draw oscillates at a higher frequency. I assume there's something analogous in LLM runtimes when the outer loop spins faster or slower.
behnamoh•30m ago
In this case, the smaller GLM MXFP4 model actually runs slower than the much larger M2.1 model.
jsheard•59m ago