Is this actually innovative? I respect that there’s a lot of work in making it a reality and in doing it specifically for AI training by modifying the training algorithms. But doing portions of the work in clusters that are far apart and then combining the results has been done many times before for non-AI workloads, right? Or so I would think.
philipkglass•31m ago
Generically speaking, yes, this has been done before. But it can take a lot of work to transform software that works with shared memory or other low-latency interprocess communication mechanisms so that it's practical to run across wide area networks. Sometimes that's not possible at all, which is why certain problems still require "high performance computing" architectures with all of their compute nodes in the same building, connected by high-bandwidth, low-latency links.
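To put rough numbers on that, here is a back-of-envelope sketch in Python. It assumes a deliberately simple cost model (each step is compute time plus one message latency plus transfer time), and every figure in it is an illustrative assumption, not a measurement of any real system. The function and constant names are made up for this example.

```python
# Toy cost model: time per step = compute + latency per exchange + transfer time.
# All numbers below are illustrative assumptions, not measurements.

def step_time(compute_s, payload_bytes, latency_s, bandwidth_bytes_per_s):
    """Wall-clock time for one compute phase followed by one data exchange."""
    return compute_s + latency_s + payload_bytes / bandwidth_bytes_per_s

def efficiency(compute_s, payload_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of wall-clock time spent on useful compute."""
    total = step_time(compute_s, payload_bytes, latency_s, bandwidth_bytes_per_s)
    return compute_s / total

COMPUTE_S = 0.010        # assumed: 10 ms of compute between exchanges
PAYLOAD_BYTES = 1e6      # assumed: 1 MB exchanged each time

# Assumed in-building interconnect: 2 microseconds latency, 25 GB/s.
LOCAL = dict(latency_s=2e-6, bandwidth_bytes_per_s=25e9)
# Assumed wide area link: 50 ms latency, 1 GB/s.
WAN = dict(latency_s=50e-3, bandwidth_bytes_per_s=1e9)

local_eff = efficiency(COMPUTE_S, PAYLOAD_BYTES, **LOCAL)
wan_eff = efficiency(COMPUTE_S, PAYLOAD_BYTES, **WAN)

# If the algorithm can be changed to exchange data only once per 100 compute
# phases (an assumption; many algorithms cannot tolerate this), the same WAN
# link is amortized over 100x more compute.
wan_batched_eff = efficiency(100 * COMPUTE_S, PAYLOAD_BYTES, **WAN)

print(f"local link:            {local_eff:.1%} of time spent computing")
print(f"WAN, every step:       {wan_eff:.1%} of time spent computing")
print(f"WAN, every 100 steps:  {wan_batched_eff:.1%} of time spent computing")
```

Under these assumed numbers, the same workload that is almost entirely compute-bound on a local link spends most of its time waiting when it synchronizes over the WAN at every step. Restructuring the algorithm so that it exchanges data far less often recovers most of the loss, but only for problems that can tolerate the less frequent exchange.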
SilverElfin•44m ago