One only needs to look at GPU-driven rendering and ray tracing in shaders to see that shader cores and memory subsystems have become flexible enough to do work beyond lock-step uniform parallelism, where the only difference between threads was the thread ID.
Nobody strives for random-access memory read patterns, but the universal popularity of buffer device address and descriptor arrays can be taken as some evidence that these indirections are no longer the friction for GPU architectures that they were ten years ago.
At the same time, the languages are no longer as restrictive as they once were. People are recording commands on the GPU. That this kind of fiddly serial work is now feasible there is a sign that the ergonomics of CPU programming have less of a relative advantage, and that cuts deeply into the cost side of the tradeoff.
IMO vendors should stop reinventing hardware video encoding and instead put that programmer time into making libwebrtc and libvpx better suit their particular use cases.
I haven't actually looked into this, but it may well be within the realm of possibility. You are already generating the frame on the GPU; if you can also encode it there (whether with NVENC or Vulkan doesn't matter), you could then DMA the frame to the NIC, using the CPU only to process the packet headers, assuming that can't also be handled by the GPU/NIC.
sylware•1h ago
It is usually more reasonable to stick with software decoders for really complex formats, or to accelerate only the heavy parts of the decoding where data corruption is easy to deal with or benign, or to aim for the middle ground: _SIMPLE_ and _VERY CONSERVATIVE_ compute shaders.
Sometimes the software cannot even tell that the hardware has crashed and is spitting out nonsense data. It gets even worse: hot reset of some hardware blocks does not actually work and requires a full power cycle... So any media player able to use hardware decoding must always provide a clear and visible button letting the user switch to full software decoding.
Then there is the next level of "corruption": some streams out there are "wrong", but that "wrong" decodes fine on some specific decoders and not on others, even though they all follow the same spec.
What a mess.
I hope those compute shaders are not written in that abomination GLSL (or the DX one, HLSL), but are instead SPIR-V shaders generated from plain and simple C code.
pandaforce•6m ago
The bitstream reader in FFmpeg's Vulkan Compute codecs is ported from the C code, along with its bounds checking. The code that validates whether a block is corrupt or decodable is also taken from the C version. To date, I've never had a GPU hang while using the Compute codecs.
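To illustrate the kind of bounds checking being described, here is a minimal sketch of a defensive bit reader in C. The names and layout are illustrative assumptions, not FFmpeg's actual `GetBitContext`; the point is the sticky error flag, which lets a decoder (CPU or GPU port) refuse to read past the end of a truncated or corrupt stream instead of wandering off into garbage.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds-checked bit reader (illustrative, not FFmpeg's API). */
typedef struct {
    const uint8_t *data;
    size_t size_bits;   /* total bits available */
    size_t pos_bits;    /* current read position, in bits */
    int error;          /* sticky flag: set once a read would overrun */
} BitReader;

static void br_init(BitReader *br, const uint8_t *data, size_t size_bytes) {
    br->data = data;
    br->size_bits = size_bytes * 8;
    br->pos_bits = 0;
    br->error = 0;
}

/* Read n bits (n <= 32), MSB first. On overrun, set the sticky error
 * flag and return 0 rather than reading out of bounds. */
static uint32_t br_read(BitReader *br, unsigned n) {
    if (br->error || n > 32 || br->pos_bits + n > br->size_bits) {
        br->error = 1;
        return 0;
    }
    uint32_t v = 0;
    for (unsigned i = 0; i < n; i++) {
        size_t p = br->pos_bits + i;
        v = (v << 1) | ((br->data[p >> 3] >> (7 - (p & 7))) & 1);
    }
    br->pos_bits += n;
    return v;
}
```

Because the logic is plain integer arithmetic with no pointers chased past the buffer, the same structure translates directly to a compute shader, which is presumably why sharing the validation code between the C and GPU paths keeps the GPU from hanging on malformed input.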