Segment Anything, however, was able to segment all 5 dog legs when prompted to, which means Meta is doing something else under the hood here, and it may lend itself to a very powerful future LLM.
Right now, some of the biggest complaints people have with LLMs stem from their incompetence at processing visual data. Maybe Meta is onto something here.
I would recommend bounding boxes.
I'm using small and medium.
Also, the code for using it is very short and easy to work with. You can also use ChatGPT to generate small experiments to see what fits your case better.
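A minimal sketch of what that short usage code looks like, prompting with a bounding box as suggested above (assumes the official facebookresearch/segment-anything package and a locally downloaded ViT-B checkpoint; the checkpoint path is a placeholder):

```python
# Hedged sketch: segmenting an object with SAM from a box prompt.
# Assumes `pip install segment-anything` plus torch, and a checkpoint
# file downloaded from the segment-anything repo.
import numpy as np

def xywh_to_xyxy(box):
    """Convert an (x, y, w, h) box to the (x0, y0, x1, y1) array SAM expects."""
    x, y, w, h = box
    return np.array([x, y, x + w, y + h])

def segment_with_box(image_rgb, box_xyxy, checkpoint="sam_vit_b.pth"):
    # Deferred import so the pure-numpy helper above has no heavy deps.
    from segment_anything import SamPredictor, sam_model_registry
    sam = sam_model_registry["vit_b"](checkpoint=checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)  # HxWx3 uint8 RGB array
    masks, scores, _ = predictor.predict(box=box_xyxy, multimask_output=False)
    return masks[0], scores[0]  # boolean mask + confidence score
```

The whole experiment really is a handful of lines: load an image, draw a box around one dog leg, call `segment_with_box`.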
They are super fast.
It's just an alternative I'm mentioning. I'm assuming a person who knows a little bit of that domain.
Otherwise the first option would be CLIP, I assume. An LLM-VL is just super slow and compute-intensive.
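For context, the CLIP alternative mentioned here is zero-shot labeling: embed the image and a few candidate text labels, then rank by similarity. A minimal sketch, assuming the openai/CLIP package (`pip install git+https://github.com/openai/CLIP`) and torch; the prompt template is an illustrative choice:

```python
# Hedged sketch: zero-shot image labeling with CLIP, the cheap
# alternative to running a full vision-language LLM.
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def clip_label_scores(pil_image, labels, model_name="ViT-B/32"):
    # Deferred imports keep the pure-numpy helper above dependency-free.
    import torch
    import clip  # openai/CLIP
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load(model_name, device=device)
    image = preprocess(pil_image).unsqueeze(0).to(device)
    text = clip.tokenize([f"a photo of a {lab}" for lab in labels]).to(device)
    with torch.no_grad():
        img_emb = model.encode_image(image)[0].cpu().numpy()
        txt_embs = model.encode_text(text).cpu().numpy()
    return {lab: cosine_sim(img_emb, t) for lab, t in zip(labels, txt_embs)}
```

This runs in milliseconds per image on a GPU, which is why it beats an LLM-VL on speed and compute when you only need coarse recognition.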
So the human results should have a clean mesh. But that’s separate from whatever pipeline they use for non-human objects.
Checkout https://github.com/MiscellaneousStuff/meta-sam-demo
It's a rip of the previous SAM playground. I use it for a bunch of things.
SAM 3 is incredible. I'm surprised it's not getting more attention.
Remember, it's not the idea, it's the marketing!
trevorhlynn•2mo ago
https://news.ycombinator.com/item?id=45982073
dang•2mo ago
Meta Segment Anything Model 3 - https://news.ycombinator.com/item?id=45982073 - Nov 2025 (133 comments)
p.s. This was lobbed onto the frontpage by the second-chance pool (https://news.ycombinator.com/item?id=26998308) and I need to make sure we don't end up with duplicate threads that way.