
Video Encoding and Decoding with Vulkan Compute Shaders in FFmpeg

https://www.khronos.org/blog/video-encoding-and-decoding-with-vulkan-compute-shaders-in-ffmpeg
41•y1n0•3d ago

Comments

sylware•1h ago
Well, the problem with hardware decoding is that it cannot handle all the variations of data corruption, which results in hardware crashes, sometimes not recoverable with a soft reset of the hardware block.

It is usually more reasonable to work with software decoders for really complex formats, or only to accelerate some heavy parts of the decoding where data corruption is really easy to deal with or benign, or aim for the middle ground: _SIMPLE_ and _VERY CONSERVATIVE_ compute shaders.

Sometimes, the software cannot even tell that the hardware has actually 'crashed' and is spitting out nonsense data. It gets even worse: some hardware block hot resets do not actually work and require a power cycle... So a 'media player' able to use hardware decoding must always provide a clear and visible 'user button' to let the user switch to full software decoding.

Then, there is the next step of "corruption": some streams out there are "wrong", but this "wrong" will decode fine on only some specific decoders and not on others, even though all of them follow the same spec.

What a mess.

I hope those compute shaders are not using that abomination of GLSL (or the DX one), and are instead SPIR-V shaders generated from plain and simple C code.
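
Editor's sketch of the "switch to full software decoding" escape hatch described above, assuming two hypothetical decoder callables (`hw_decode`, `sw_decode`) and a hypothetical `looks_valid` sanity check. None of these names come from FFmpeg or any real player. As the comment notes, software cannot always tell that the hardware is producing garbage, so the automatic check is only a heuristic and the manual `force_software` switch still matters.

```python
class DecodeError(Exception):
    """Raised by a (hypothetical) decoder when a packet cannot be decoded."""


class FallbackDecoder:
    """Try the hardware path first; drop to software on failure or on user request."""

    def __init__(self, hw_decode, sw_decode, looks_valid):
        # All three callables are illustrative placeholders, not a real API.
        self._hw = hw_decode
        self._sw = sw_decode
        self._looks_valid = looks_valid
        self.use_hardware = True

    def force_software(self):
        """What the visible 'user button' would call."""
        self.use_hardware = False

    def decode(self, packet):
        if self.use_hardware:
            try:
                frame = self._hw(packet)
                if self._looks_valid(frame):
                    return frame
            except DecodeError:
                pass
            # Hardware failed or returned something suspicious:
            # stay on the software path from now on.
            self.use_hardware = False
        return self._sw(packet)
```

Once the decoder has fallen back, it never retries the hardware path in this sketch; a real player might retry after a reset, which is exactly the part the comment warns is unreliable.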

pandaforce•6m ago
These are all gripes you might have with Vulkan Video. Unlike with Vulkan Video, in Compute, bounds checking is the norm. Overreading a regular buffer will not result in a GPU hang or crash. If you use pointers, it will, but then it's up to you to check whether overreads can happen.

The bitstream reader in FFmpeg for Vulkan Compute codecs is copied from the C code, along with bounds checking. The code which validates whether a block is corrupt or decodable is also taken from the C version. To date, I've never got a GPU hang while using the Compute codecs.
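
Editor's sketch of what a bounds-checked bitstream reader looks like in general. This is not FFmpeg's code: the class name `BitReader`, the `overread` flag, and the choice to return zero bits past the end of the buffer are all assumptions made for illustration.

```python
class BitReader:
    """MSB-first bit reader that never reads outside its buffer."""

    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0          # current position, in bits
        self.overread = False  # set once a read runs past the end

    def bits_left(self) -> int:
        return max(len(self._data) * 8 - self._pos, 0)

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte_index = self._pos >> 3
            if byte_index < len(self._data):
                shift = 7 - (self._pos & 7)
                bit = (self._data[byte_index] >> shift) & 1
            else:
                # Past the end: hand back a zero bit and remember it,
                # instead of touching memory we do not own.
                bit = 0
                self.overread = True
            value = (value << 1) | bit
            self._pos += 1
        return value
```

A caller would check `overread` after parsing a block and treat the block as corrupt rather than trust the zero-filled values.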

positron26•34m ago
> Most popular codecs were designed decades ago, when video resolutions were far smaller. As resolutions have exploded, those fixed-size minimum units now represent a much smaller fraction of a frame — which means far more of them can be processed in parallel. Modern GPUs have also gained features enabling cross-invocation communication, opening up further optimization opportunities.

One only needs to look at GPU driven rendering and ray tracing in shaders to deduce that shader cores and memory subsystems these days have become flexible enough to do work besides lock-step uniform parallelism where the only difference was the thread ID.

Nobody strives for random access memory read patterns, but the universal popularity of buffer device address and descriptor arrays can be taken somewhat as proof that these indirections are no longer the friction for GPU architectures that they were ten years ago.

At the same time, the languages are no longer as restrictive as they once were. People are recording commands on the GPU. This kind of fiddly serial work is an indication that the ergonomics of CPU programming have less of a relative advantage, and that cuts deeply into the tradeoff costs.
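
Editor's note on the quoted passage: a rough calculation shows how the number of independent units grows with resolution. It assumes a 16x16 block (the H.264 macroblock size); the quote itself names no codec or block size, and real codecs add dependencies between blocks that limit how many can actually run in parallel.

```python
def block_count(width: int, height: int, block: int = 16) -> int:
    """Number of block x block units needed to cover a frame (edges rounded up)."""
    cols = -(-width // block)   # ceiling division
    rows = -(-height // block)
    return cols * rows


RESOLUTIONS = {
    "480p (720x480)": (720, 480),
    "1080p (1920x1080)": (1920, 1080),
    "4K (3840x2160)": (3840, 2160),
}

for name, (w, h) in RESOLUTIONS.items():
    n = block_count(w, h)
    print(f"{name}: {n} blocks, each {100 / n:.4f}% of the frame")
```

Under these assumptions a 480p frame has 1,350 blocks and a 4K frame has 32,400, so each block covers 24 times less of the picture.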

doctorpangloss•21m ago
What is the use case? Okay, ultra low latency streaming. That is good. But. If you are sending the frames via some protocol over the network, like WebRTC, it will be touching the CPU anyway. Software encoding of 4K H.264 runs in real time on a single thread on 65 W, decade-old CPUs, with low latency. The CPU encoders give much better quality and are more flexible. So it's very difficult to justify the level of complexity needed for hardware video encoding. Absolutely no need for it for TV streaming, for example. But people who have no need for it keep being obsessed with it.

IMO vendors should stop reinventing hardware video encoding and instead assign the programmer time to making libwebrtc and libvpx better suit their particular use case.

eptcyka•18m ago
It will be more energy efficient. And the CPU is free to JIT half a gig of JavaScript in the meantime.
temp0826•6m ago
It's hugely more efficient; if you're on a battery-powered device it could mean hours more of play time. It's pretty insane just how much better it is (I go through a bit of extra effort to make sure it's working for me; hw decoding isn't included in some distros).
xattt•16m ago
It’s a leftover mindset from the mid-2000s when GPGPU became possible, and additional performance was “unlocked” from otherwise under-utilized silicon.
jpc0•12m ago
I'm not entirely sure that this is true.

I haven't actually looked into this, so it might not be within the realm of possibility. But you are generating a frame on the GPU; if you can also encode it there (whether with NVENC or Vulkan doesn't matter), you could then DMA it to the NIC while using the CPU only to process the packet headers, assuming that cannot also be handled by the GPU/NIC.

chillfox•5m ago
The article explains it. This is not for streaming over the web, but for editing professional grade video on consumer hardware.
pandaforce•3m ago
The article explicitly mentions that mainstream codecs like H264 are not the target. This is for very high bitrate high resolution professional codecs.

I'm OK being left behind, thanks

https://shkspr.mobi/blog/2026/03/im-ok-being-left-behind-thanks/
87•coinfused•20m ago•26 comments

ArXiv Declares Independence from Cornell

https://www.science.org/content/article/arxiv-pioneering-preprint-server-declares-independence-co...
471•bookstore-romeo•9h ago•159 comments

Entso-E final report on Iberian 2025 blackout

https://www.entsoe.eu/publications/blackout/28-april-2025-iberian-blackout/
69•Rygian•2h ago•15 comments

Flash-KMeans: Fast and Memory-Efficient Exact K-Means

https://arxiv.org/abs/2603.09229
83•matt_d•3d ago•4 comments

Schizophrenia study finds new biomarker, drug candidate to treat symptoms

https://news.northwestern.edu/stories/2026/03/schizophrenia-study-finds-new-biomarker-drug-candid...
6•gmays•41m ago•0 comments

The Soul of a Pedicab Driver

https://www.sheldonbrown.com/pedicab.html
72•haritha-j•4h ago•20 comments

Google details new 24-hour process to sideload unverified Android apps

https://arstechnica.com/gadgets/2026/03/google-details-new-24-hour-process-to-sideload-unverified...
987•0xedb•20h ago•1056 comments

Regex Blaster

https://mdp.github.io/regex-blaster/
13•mdp•2d ago•3 comments

Just Put It on a Map

https://progressandpoverty.substack.com/p/just-put-it-on-a-map
40•surprisetalk•4d ago•20 comments

Drawvg Filter for FFmpeg

https://ayosec.github.io/ffmpeg-drawvg/
114•nolta•2d ago•21 comments

Full Disclosure: A Third (and Fourth) Azure Sign-In Log Bypass Found

https://trustedsec.com/blog/full-disclosure-a-third-and-fourth-azure-sign-in-log-bypass-found
215•nyxgeek•12h ago•57 comments

Show HN: Sonar – A tiny CLI to see and kill whatever's running on localhost

https://github.com/RasKrebs/sonar
38•raskrebs•3h ago•19 comments

Too Much Color

https://www.keithcirkel.co.uk/too-much-color/
60•maguay•2d ago•28 comments

Drugwars for the TI-82/83/83 Calculators (2011)

https://gist.github.com/mattmanning/1002653/b7a1e88479a10eaae3bd5298b8b2c86e16fb4404
200•robotnikman•13h ago•61 comments

Building a Reader for the Smallest Hard Drive

https://www.willwhang.dev/Reading-MK4001MTD/
66•voctor•4d ago•19 comments

Push events into a running session with channels

https://code.claude.com/docs/en/channels
365•jasonjmcghee•13h ago•214 comments

Return of the Obra Dinn: spherical mapped dithering for a 1bpp first-person game

https://forums.tigsource.com/index.php?topic=40832.msg1363742#msg1363742
423•PaulHoule•3d ago•53 comments

Show HN: Three new Kitten TTS models – smallest less than 25MB

https://github.com/KittenML/KittenTTS
480•rohan_joshi•22h ago•163 comments

Cursor Composer 2 is just Kimi K2.5 with RL

https://twitter.com/fynnso/status/2034706304875602030
171•mirzap•4h ago•77 comments

How the Turner twins are mythbusting modern technical apparel

https://www.carryology.com/insights/how-the-turner-twins-are-mythbusting-modern-gear/
284•greedo•2d ago•146 comments

HP realizes that mandatory 15-minute support call wait times isn't good support

https://arstechnica.com/gadgets/2025/02/misguided-hp-customer-support-approach-included-forced-15...
14•felineflock•35m ago•2 comments

FSF statement on copyright infringement lawsuit Bartz v. Anthropic

https://www.fsf.org/blogs/licensing/2026-anthropic-settlement
156•m463•3d ago•72 comments

4Chan mocks £520k fine for UK online safety breaches

https://www.bbc.com/news/articles/c624330lg1ko
418•mosura•23h ago•759 comments

Astral to Join OpenAI

https://astral.sh/blog/openai
1405•ibraheemdev•1d ago•857 comments

Cockpit is a web-based graphical interface for servers

https://github.com/cockpit-project/cockpit
291•modinfo•17h ago•166 comments

Noq: n0's new QUIC implementation in Rust

https://www.iroh.computer/blog/noq-announcement
230•od0•19h ago•35 comments

Delphi 13.1 Released, with ARM64 support

https://blogs.embarcadero.com/announcing-the-availability-of-rad-studio-13-florence-update-1/
32•nopakos•2h ago•15 comments

France's aircraft carrier located in real time by Le Monde through fitness app

https://www.lemonde.fr/en/international/article/2026/03/20/stravaleaks-france-s-aircraft-carrier-...
12•MrDresden•57m ago•6 comments

Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster

https://blog.skypilot.co/scaling-autoresearch/
207•hopechong•21h ago•88 comments