Also, it has RDMA. Last I checked, Ray did not support RDMA.
There are probably other differences as well, but the lack of RDMA immediately splits the world into things you can do with Ray and things you cannot.
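To make the distinction concrete, here is a minimal sketch of the shape of a one-sided RDMA read: a reader pulls bytes from a registered remote buffer without the owner's CPU running any handler per transfer. All names here are hypothetical illustrations, not any real Monarch or Ray API.

```python
# Illustrative sketch only: mimics the shape of a one-sided RDMA read.
# In a real RDMA setup the buffer would be pinned memory registered with
# the NIC, and the read would go NIC-to-NIC, bypassing the owner's CPU.

class RegisteredBuffer:
    """Stands in for a pinned, RDMA-registered memory region."""

    def __init__(self, data: bytearray):
        self.data = data  # real RDMA: pinned memory exposed via an rkey

    def read(self, offset: int, length: int) -> bytes:
        # One-sided read: the owning process runs no handler code here.
        return bytes(self.data[offset:offset + length])

buf = RegisteredBuffer(bytearray(b"gradient-shard-0"))
chunk = buf.read(0, 8)
print(chunk)  # b'gradient'
```

The contrast with a message-passing system is that the owner never has to schedule a task or copy data to serve the read, which is what makes RDMA attractive for moving large tensors.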
https://pytorch.org/blog/pytorch-foundation-welcomes-ray-to-...
As far as things that might be a performance loss here, one thing I'm wondering is whether custom kernels are supported. I'm also wondering how much fine-grained control there is over communication between different actors calling a function. Overall, I really like this project and hope to see it used over multi-controller setups.
Yeah, you might end up needing some changes to remote worker initialization, but you can generally bake in whatever kernels and other system code you need.
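A sketch of what "baking in" custom kernels at worker initialization could look like. The actor and registry names below are hypothetical, not Monarch's real API; in practice the setup step might call `torch.utils.cpp_extension.load` or load a prebuilt `.so` instead of the plain-Python stand-in used here.

```python
# Hedged sketch: custom kernels loaded once during actor setup, so every
# remote worker has them available before its first training step.
# KERNELS / register_kernel / TrainerActor are illustrative names only.

KERNELS = {}

def register_kernel(name):
    """Decorator standing in for compiling/loading a custom kernel."""
    def deco(fn):
        KERNELS[name] = fn
        return fn
    return deco

class TrainerActor:
    def setup(self):
        # Runs once per worker at spawn time, analogous to remote worker
        # initialization hooks; a real version would JIT-compile CUDA here.
        @register_kernel("fused_add_relu")
        def fused_add_relu(x, y):
            # placeholder for a real fused CUDA kernel
            return [max(a + b, 0) for a, b in zip(x, y)]

    def step(self, x, y):
        return KERNELS["fused_add_relu"](x, y)

actor = TrainerActor()
actor.setup()
print(actor.step([1, -5], [2, 1]))  # [3, 0]
```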
In case someone who can fix this is reading here:
Found a few typos. The em dash makes me suspect an LLM was involved in proofreading.
- Is this similar to Open MPI?
- How is a mesh established? Do they need to be on the same host?
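On the mesh question, the general idea (sketched below with hypothetical names, not Monarch's actual API) is that a mesh is a logical n-dimensional grid of processes addressed by coordinates like (host, gpu), so it is explicitly designed to span multiple hosts rather than being limited to one machine.

```python
# Illustrative sketch only: a "mesh" as a logical grid of processes.
# ProcMesh and slice() are hypothetical names; the real runtime would
# resolve each coordinate to a live process reachable over the network.

from itertools import product

class ProcMesh:
    def __init__(self, hosts: int, gpus_per_host: int):
        self.shape = (hosts, gpus_per_host)
        # each coordinate maps to a process; hosts need not be the same machine
        self.procs = {(h, g): f"host{h}/gpu{g}"
                      for h, g in product(range(hosts), range(gpus_per_host))}

    def host_slice(self, host: int):
        """Address a sub-mesh, e.g., all GPUs on one host."""
        return [p for (h, _), p in sorted(self.procs.items()) if h == host]

mesh = ProcMesh(hosts=2, gpus_per_host=4)
print(mesh.shape)         # (2, 4)
print(mesh.host_slice(1))  # ['host1/gpu0', 'host1/gpu1', 'host1/gpu2', 'host1/gpu3']
```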
pjmlp•3h ago
> Monarch is split into a Python-based frontend, and a backend implemented in Rust.
Other than that, it looks like quite an interesting project.
galangalalgol•2h ago
gaogao•1h ago
dhrt12327•1h ago
It's a pity they don't do a complete rewrite with a functional language as the driver.
gaogao•1h ago
It's open source, so seeing such an extension would be quite cool. There's much that could be done with native Rust actors and code that might get at what you want, but nothing precludes mixing PyTorch and other backends.
For example, you could wrap a C++ inference engine as part of one of the actors generating data for other actors doing distributed training.
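A minimal sketch of that pattern: one actor wraps an (imagined) C++ inference engine and streams generated samples to a training actor. The engine binding is faked with a plain function; a real setup might use pybind11 bindings or a subprocess. All names here are illustrative, not any real API.

```python
# Hedged sketch: a generator actor wrapping a C++ inference engine feeds
# a trainer actor through a queue. cpp_engine_generate stands in for a
# pybind11-bound C++ call; threads stand in for remote actors.

import queue
import threading

def cpp_engine_generate(prompt: str) -> str:
    # stand-in for a C++ inference engine call
    return f"completion-for:{prompt}"

def generator_actor(prompts, out_q):
    for p in prompts:
        out_q.put(cpp_engine_generate(p))
    out_q.put(None)  # sentinel: generation finished

def trainer_actor(in_q):
    seen = []
    while (item := in_q.get()) is not None:
        seen.append(item)  # a real trainer would run a training step here
    return seen

q = queue.Queue()
t = threading.Thread(target=generator_actor, args=(["a", "b"], q))
t.start()
batch = trainer_actor(q)
t.join()
print(batch)  # ['completion-for:a', 'completion-for:b']
```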
pjmlp•1h ago