FastVLM seems to handle only still images, not video: when I wave my hand, it says the user is holding his hand up. I think it's just processing one frame at a time. Great speed and accuracy though, running on WebGPU on my MacBook Air M4.
P.S. Do you know of any model that works on a video feed?
slacka•4h ago
MobileCLIP2: https://huggingface.co/collections/apple/mobileclip2-68ac947...