I didn’t want to spin up a VM every time or use browser-only tools, so I built a tiny CLI that runs local .cu files on Modal GPUs with a single command.
Modal gives around $30 in free credits when you sign up, and the CLI is designed to make the most of them. Simple kernels on smaller GPUs (like a T4) cost very little; I ran multiple tests for under $1.
What it does:
- Sends your .cu file to a Modal container with a GPU image (you can override the image)
- Compiles it with nvcc (supports passing nvcc args)
- Runs it on any GPU you choose (T4 through B200)
- Streams stdout/stderr back to your terminal
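For reference, here is a minimal .cu file of the kind you could run this way. It is a standard hello-world kernel, not code from the project itself:

```cuda
#include <cstdio>

// Trivial kernel: each thread prints its own index.
__global__ void hello() {
    printf("Hello from thread %d\n", threadIdx.x);
}

int main() {
    hello<<<1, 4>>>();        // launch 1 block of 4 threads
    cudaDeviceSynchronize();  // wait for the kernel (and its printf) to flush
    return 0;
}
```

Saved as myKernel.cu, this is exactly the shape of file the usage examples below compile and run remotely.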
Usage:

uv add modal-cuda
mcc myKernel.cu --gpu H100

# or, without installing first:
uvx --from modal-cuda mcc myKernel.cu --nvcc-arg=-Xptxas --nvcc-arg=-v
Links:
GitHub: https://github.com/ExpressGradient/modal-cuda
PyPI: https://pypi.org/project/modal-cuda