We've been working on making Gaussian Splatting scenes searchable. A common approach embeds high-dimensional semantic features directly into the model points, increasing training complexity and memory usage.
We tried a simpler post-process approach:
- Index the source 2D imagery with embeddings
- Use an LMM (large multimodal model) to localize the queried object in the source frames
- Use known camera poses to raycast and project those 2D detections into 3D space
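To make the last step concrete, here is a minimal sketch of lifting 2D detections into 3D. It is not the actual implementation. It assumes a pinhole camera with intrinsics `K`, a 4x4 camera-to-world pose per frame, cameras looking down +z, and one detection center pixel per frame. The names `pixel_to_ray` and `triangulate` are made up for illustration. It also swaps in multi-view triangulation (the least-squares point closest to all rays) for a true raycast against the scene, since that needs nothing beyond the poses.

```python
import numpy as np

def pixel_to_ray(u, v, K, cam_to_world):
    """Return (origin, unit direction) in world space for pixel (u, v).

    Assumes a pinhole model with 3x3 intrinsics K and a 4x4
    camera-to-world pose, with the camera looking down +z.
    """
    # Back-project the pixel through the intrinsics into camera space.
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    # Rotate the direction into world space; the camera center is the origin.
    d_world = R @ d_cam
    return t, d_world / np.linalg.norm(d_world)

def triangulate(rays):
    """Least-squares 3D point closest to all (origin, direction) rays.

    Needs at least two non-parallel rays, otherwise the system is singular.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for origin, direction in rays:
        # Projector onto the plane perpendicular to the ray direction.
        P = np.eye(3) - np.outer(direction, direction)
        A += P
        b += P @ origin
    return np.linalg.solve(A, b)

# Two frames that both detected the same object (illustrative values).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
pose_a = np.eye(4)            # camera at the world origin
pose_b = np.eye(4)
pose_b[0, 3] = 1.0            # second camera shifted 1 unit along x

rays = [pixel_to_ray(320, 240, K, pose_a),
        pixel_to_ray(220, 240, K, pose_b)]
point = triangulate(rays)
```

With a single frame there is nothing to triangulate against, so the ray would instead have to be intersected with the scene itself, for example using depth rendered from the splat.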
This allows for "Ctrl+F" style search on standard 3DGS models without modifying the training pipeline. If you search for a list of items, it’s possible to auto-tag an entire scene in parallel.
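As a rough sketch of the auto-tagging idea: because each query is independent, a list of items can be fanned out over a worker pool. Here `locate` is a hypothetical callable standing in for the whole search pipeline (query string in, 3D position or `None` out), and `tag_scene` is an illustrative name, not part of any real API. Threads are assumed to be a reasonable fit because the work is dominated by waiting on model calls.

```python
from concurrent.futures import ThreadPoolExecutor

def tag_scene(queries, locate, max_workers=8):
    """Run one search per query concurrently and return {query: result}.

    `locate` is a hypothetical stand-in for the full search pipeline.
    """
    queries = list(queries)  # allow generators; we iterate twice
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each query
        # with its own result.
        results = pool.map(locate, queries)
        return dict(zip(queries, results))

# Usage with a stub in place of the real search:
known = {"chair": (1.0, 0.0, 2.0), "lamp": (0.0, 1.5, 3.0)}
tags = tag_scene(["chair", "lamp", "sofa"], known.get)
```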
There is a demo linked in the post if you want to try it out. Happy to answer questions about the implementation!
cpk26•1h ago