Gaussian Splatting Meets ROS2

61•shadygm•7mo ago

Comments

arijun•7mo ago

This page is pretty light on the what and why. I gather it’s using ROS (which I had to look up to confirm means robot operating system) to render Gaussian splatting. And that’s faster than a dedicated GPU renderer? Doesn’t ROS add overhead in the form of message passing?

inhumantsar•7mo ago

it's for visualizing a robot's camera data in 3d space

shadygm•7mo ago

Hey! Great question, and thanks for taking a look!

The main idea behind ROSplat is to make it easier to send and visualize Gaussians over the network, especially in robotics applications. For instance, imagine you're running a SLAM algorithm on a mobile robot and generating Gaussians as part of the mapping or localization process. With ROSplat, you can stream those Gaussians via ROS messages and visualize them live on another machine. It’s mostly a visualization tool that uses ROS for communication, making it accessible and convenient for robotics engineers and researchers already working within that ecosystem.
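To make the "stream Gaussians as messages" idea concrete, here is a minimal sketch of serializing Gaussians into a flat byte payload and reading them back on the receiving side. The field layout, the `Gaussian` class, and the `pack_gaussians` / `unpack_gaussians` helpers are all assumptions made for illustration; they are not ROSplat's actual message definition, and no ROS API is used here.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical wire layout (NOT ROSplat's real message format):
# position xyz, scale xyz, rotation quaternion wxyz, opacity, RGB colour.
_GAUSSIAN_FMT = "<3f3f4f1f3f"                    # 14 little-endian float32 values
_GAUSSIAN_SIZE = struct.calcsize(_GAUSSIAN_FMT)  # 56 bytes per Gaussian


@dataclass
class Gaussian:
    position: Tuple[float, float, float]
    scale: Tuple[float, float, float]
    rotation: Tuple[float, float, float, float]
    opacity: float
    color: Tuple[float, float, float]


def pack_gaussians(gaussians: List[Gaussian]) -> bytes:
    """Serialize Gaussians as a uint32 count followed by fixed-size records."""
    header = struct.pack("<I", len(gaussians))
    body = b"".join(
        struct.pack(_GAUSSIAN_FMT, *g.position, *g.scale, *g.rotation,
                    g.opacity, *g.color)
        for g in gaussians
    )
    return header + body


def unpack_gaussians(payload: bytes) -> List[Gaussian]:
    """Inverse of pack_gaussians: rebuild the list from the byte payload."""
    (count,) = struct.unpack_from("<I", payload, 0)
    gaussians = []
    offset = 4  # skip the uint32 count
    for _ in range(count):
        v = struct.unpack_from(_GAUSSIAN_FMT, payload, offset)
        gaussians.append(Gaussian(v[0:3], v[3:6], v[6:10], v[10], v[11:14]))
        offset += _GAUSSIAN_SIZE
    return gaussians
```

In a real setup the payload would travel inside a ROS message from the robot to the viewer; the sketch only covers the pack and unpack steps on either end.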

Just to clarify, ROSplat isn’t aiming to be faster than state-of-the-art rendering methods. The actual rendering is done with OpenGL, not ROS, so there’s no performance claim there. ROS is just used for the messaging, which does introduce a bit of overhead, but the benefit is in the ease of integration and live data sharing in robotics setups.

Also, I wrote a simple technical report explaining some things in more detail, you can find it in the repo!

Hope that clears things up a bit!

hirako2000•7mo ago

I'm still confused, despite the detailed explanation of the use case.

Today, generating a static point cloud with Gaussians involves:

- An offline, far-from-real-time process to generate spatial information from 2D captures. LiDAR captures may help but don't drastically cut down this heavy step.
- A "training" step that generates Gaussian information from the 2D captures and geospatial data.

Unless I'm describing an outdated workflow, or my RTX GPU is too consumer-grade, how would all of this perform well enough on embedded systems to make fast communication of Gaussians relevant?

shadygm•7mo ago

There are some algorithms, such as Photo-SLAM and Gaussian Splatting SLAM (although far heavier and slower), that show it is indeed possible to estimate position and generate Gaussians in real time. These are definitely still the early days for these techniques tho.

The offline method still generates significantly higher resolution scenes of course, but as time goes on, real-time Gaussian Splatting will become more common and will be close to offline methods.

This means that in the near future, we will be able to generate highly realistic scenes using Gaussian Splats on a smart edge + mobile robot in real-time and pass the splats via ROS onto another device running ROSplat (or other) and perform the visualisation there.

hirako2000•7mo ago

OK. Thanks for your projections.

I generate on a GPU: I can barely fit a large scene in 12GB of memory, and it takes many hours to produce Gaussians at 30k steps.

I'm sure the tech will evolve, hardware too. We are just 5y away.

I respect you open sourcing your work, it is innovative. Feels like a trophy splash, I suggest putting a link to something substantial, perhaps a page explaining where the tech will land and how this project fits that future, rather than a link to some LinkedIn.

shadygm•7mo ago

Hey, I appreciate the feedback.

I did not put a LinkedIn link in the post or repo, but I totally get your point about wanting something more substantial to explain the bigger picture.

A lot of the motivation and reasoning behind the project is already included in the technical report PDF attached in the repository. I tried to make it as self-contained as possible for those curious about the background and use cases.

That said, if I find some time, I’ll definitely consider putting together a separate page to outline where I think this kind of tool fits into the broader future of GS and robotics.

Thanks again!

somethingsome•7mo ago

I'm very curious about that. My average training run with ~25-30 high-quality cameras takes around 20 minutes and a few GB of memory on a single GPU. What is the size of your large-scale scenes? I see many possible optimizations to lower that memory use and time.

markisus•7mo ago

I have done a recent proof of concept to generate Gaussian splats from depth cameras in real-time. The intended application is for robotics and teleoperation. I made a post on reddit [1] a while back if you're interested.

I believe the quality of realtime Gaussian splatting will improve with time. The OP's project could help ROS2 users take advantage of those new techniques. Someone might need to make a Gaussian splat video codec to bring down the bandwidth cost of streaming Gaussians.
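For a rough sense of that bandwidth cost, here is a back-of-the-envelope sketch. It assumes uncompressed float32 attributes in the common 3DGS layout (position, scale, rotation quaternion, opacity, and degree-3 spherical-harmonic colour) and that the full set of Gaussians is resent on every update. The scene size and update rate in the usage note are hypothetical, not measurements from ROSplat or any real system.

```python
# Floats per Gaussian in the common uncompressed 3DGS layout (an assumption
# about the payload, not a statement about any specific ROS message type):
# 3 position + 3 scale + 4 rotation quaternion + 1 opacity + 48 SH colour.
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 48
BYTES_PER_FLOAT32 = 4


def bytes_per_gaussian(floats_per_gaussian: int = FLOATS_PER_GAUSSIAN) -> int:
    """Size of one uncompressed Gaussian in bytes."""
    return floats_per_gaussian * BYTES_PER_FLOAT32


def stream_bandwidth_mbit_s(num_gaussians: int,
                            updates_per_second: float,
                            floats_per_gaussian: int = FLOATS_PER_GAUSSIAN) -> float:
    """Bandwidth in Mbit/s if every Gaussian is resent on every update."""
    bytes_per_second = (num_gaussians
                        * bytes_per_gaussian(floats_per_gaussian)
                        * updates_per_second)
    return bytes_per_second * 8 / 1e6
```

With these assumptions, `bytes_per_gaussian()` is 236 bytes, and `stream_bandwidth_mbit_s(100_000, 10)` comes out to 1888 Mbit/s for a hypothetical 100,000-Gaussian scene resent at 10 Hz. That is why sending only changed Gaussians, or compressing them, would matter.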

Another application could be for visualizing your robot inside a pre-built map, or for providing visual models for known objects that the robot needs to interact with. Photometric losses could then be used to optimize the poses of these known objects.

[1] https://www.reddit.com/r/GaussianSplatting/comments/1iyz4si/...

jimmySixDOF•7mo ago

So I upload a pre-baked GSplat of the ground-state physical space, presumably there is some kind of calibration, and then I can navigate the ROS device spatially using the GSplat to reflect position details instead of, or in addition to, actual camera feeds? Or are they producing the splats somehow on the ROS device with limited camera poses? Whatever the case may be, I still think the human controller side is where Splats are more useful, so add a VR headset into the loop and I think this could open up real opportunities, for example spatial minimaps, decoupled points of view, etc.

shadygm•7mo ago

Thanks for taking a look!

Just to clarify, ROSplat isn’t generating the Gaussians, it’s not a SLAM algorithm or a reconstruction tool. It’s purely a visualizer that uses ROS for message passing. The idea is that if you already have a system producing Gaussians (either live or precomputed), ROSplat lets you stream and view them in real time (as the ROS messages arrive).

So in your example, yes, you could upload a pre-baked GSplat, calibrate it to the robot’s frame, and use it for navigation or visualization. Or, if your ROS device is running something like SLAM, it can publish Gaussians as it goes. In both cases, ROSplat is just making them available for visualization, nothing more.
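The "calibrate it to the robot's frame" step amounts to applying a rigid transform to each Gaussian. A minimal sketch, assuming the calibration is already known as a rotation `R` and translation `t` (finding them is a separate problem), and with all helper names hypothetical: means transform as `R @ mean + t`, and covariances as `R @ cov @ R.T`.

```python
import math
from typing import List, Sequence

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def yaw_rotation(theta: float) -> List[List[float]]:
    """Rotation about the z axis by theta radians (example calibration)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul3(a: Mat3, b: Mat3) -> List[List[float]]:
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose3(m: Mat3) -> List[List[float]]:
    """3x3 matrix transpose."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def transform_mean(R: Mat3, t: Vec3, mean: Vec3) -> List[float]:
    """Map a Gaussian centre from the splat frame into the robot frame."""
    return [sum(R[i][k] * mean[k] for k in range(3)) + t[i] for i in range(3)]


def transform_covariance(R: Mat3, cov: Mat3) -> List[List[float]]:
    """Rotate a Gaussian's covariance; translation does not affect it."""
    return matmul3(matmul3(R, cov), transpose3(R))
```

For example, with a 90 degree yaw and a one-unit lift in z, a Gaussian centred at (1, 0, 0) lands near (0, 1, 1), and a covariance elongated along x becomes elongated along y.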

And I completely agree with you on your last comment. VR Gaussians are the way to go, and I know that a company called Varjo is currently working on them. Not sure if there's anything else that's available tho :/

dheera•7mo ago

I've actually been pondering using Gaussian splats for localization, and I think it could be done. The idea would be to look for the pose that minimizes the MSE in density (rather than feature points or RGB similarity, which are both vulnerable to lighting changes).
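A toy sketch of that idea, heavily simplified: the map is a handful of 2D isotropic Gaussians, the "pose" is translation only, and the search is a brute-force grid rather than a real optimizer. Every name here (`density`, `sample_density`, `localize`) is hypothetical, and this is not a method taken from any of the projects mentioned in the thread.

```python
import math
from typing import List, Sequence, Tuple

Blob = Tuple[float, float, float, float]  # (mean_x, mean_y, sigma, weight)
Pose = Tuple[float, float]                # translation only, for simplicity


def density(blobs: Sequence[Blob], x: float, y: float) -> float:
    """Sum of weighted isotropic 2D Gaussians evaluated at (x, y)."""
    return sum(w * math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2 * s * s))
               for mx, my, s, w in blobs)


def sample_density(blobs: Sequence[Blob], pose: Pose,
                   offsets: Sequence[Tuple[float, float]]) -> List[float]:
    """Density the robot would see at fixed offsets around a given pose."""
    tx, ty = pose
    return [density(blobs, tx + dx, ty + dy) for dx, dy in offsets]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length sample lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def localize(blobs: Sequence[Blob], observed: Sequence[float],
             offsets: Sequence[Tuple[float, float]],
             candidates: Sequence[Pose]) -> Pose:
    """Return the candidate pose whose predicted density best matches."""
    return min(candidates,
               key=lambda pose: mse(sample_density(blobs, pose, offsets),
                                    observed))
```

Given observations sampled at a true pose that is on the candidate grid, `localize` recovers that pose because its MSE is exactly zero. A practical version would need rotation, 3D anisotropic Gaussians, sensor noise, and a gradient-based optimizer instead of a grid.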

jimmySixDOF•7mo ago

Varjo are good at whatever they do, but also check out @gracia_vr [1], who focus on Splats in XR. PlayCanvas also has SuperSplat, which lets you view 3DGS in immersive mode [2].

[1] https://www.gracia.ai/

[2] https://github.com/playcanvas/supersplat

gitroom•7mo ago

Nice, these back and forths always remind me how much cool stuff is brewing behind the scenes. Tbh I'd love seeing more live demos of things like this, helps my brain get what's really happening.

shadygm•7mo ago

Yeah I agree, lack of visuals sometimes makes it hard to tell what's happening when a field moves as fast as it does in GS. There's a github page called Awesome3DGS [1] that is updated whenever there is a new paper in GS. It helped me a lot when I was getting started.

Most papers also have their own project page that showcases their contributions or demos their project as well (:

[1] https://github.com/MrNeRF/awesome-3D-gaussian-splatting
