The article claims that the task it describes "Takes about ~20-30 mins. The cognitive load is high....", while its own literal step of Googling "ffmpeg combine static image and audio" gives you the exact command you need to run, from a known source (a superuser.com answer sourced from the ffmpeg wiki).
Anyone even slightly familiar with ffmpeg should be able to produce the same result in minutes. For someone who doesn't understand what ffmpeg is, the article means absolutely nothing. How does a "no coder" understand what an "agent in a sandboxed container" is?
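For reference, the widely circulated ffmpeg wiki recipe that search turns up looks roughly like the sketch below, built here as an argv list. File names are placeholders, and the exact flags may vary by answer:

```python
# Sketch of the well-known "static image + audio" ffmpeg recipe,
# assembled as an argv list. cover.png / narration.mp3 / out.mp4
# are placeholder names.
def image_plus_audio_cmd(image, audio, output):
    return [
        "ffmpeg",
        "-loop", "1",            # repeat the single image for the whole duration
        "-i", image,
        "-i", audio,
        "-c:v", "libx264",
        "-tune", "stillimage",   # x264 tuning for static content
        "-c:a", "aac",
        "-b:a", "192k",
        "-pix_fmt", "yuv420p",   # broad player compatibility
        "-shortest",             # stop when the shorter input (audio) ends
        output,
    ]

cmd = image_plus_audio_cmd("cover.png", "narration.mp3", "out.mp4")
print(" ".join(cmd))
```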
We have our designer/intern in mind, who creates shorts, adds subtitles, crops them, and merges the generated audio. He is aware of ffmpeg and prefers using a SaaS UI on top of it.
However, we see him hanging out on ChatGPT or Gemini all the time. He is literally the "no coder" we have in mind.
We just combined his type-what-you-want and ffmpeg workflows.
He does use DaVinci Resolve, but only for 2.
NLEs turn ffmpeg into a standalone yet easy-to-use tool.
Not denying that the major heavy lifting is done by the NLE. We go a step further and make it embeddable in a larger workflow.
Treating ffmpeg as "just another capability" allows it to be stitched into workflows.
If the syntax looked like `gst-launch-1.0 filesrc ! qtdemux ! matroskamux ! filesink ...`, maybe people would be less frustrated?
People would also learn a little more and be less frustrated when conversations about containers/codecs/colorspaces etc. come up. Each has a dedicated element, and you can better understand its I/O.
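To illustrate the "one dedicated element per concern" point, here is a sketch of a video-only MP4-to-MKV remux expressed as a GStreamer pipeline, with each stage's role noted. File names are placeholders, and this assumes gst-launch's delayed linking handles the demuxer's dynamic pads:

```python
# Sketch: an MP4 -> MKV remux as a gst-launch pipeline string, one element
# per stage. in.mp4 / out.mkv are placeholder paths.
elements = [
    "filesrc location=in.mp4",    # raw bytes from disk
    "qtdemux",                    # container parsing: splits MP4 into streams
    "h264parse",                  # codec-level framing for the H.264 stream
    "matroskamux",                # repackage into the MKV container
    "filesink location=out.mkv",  # bytes back to disk
]
pipeline = "gst-launch-1.0 " + " ! ".join(elements)
print(pipeline)
```

Container, codec, and sink each get their own named element, which is exactly what makes the conversation about them easier.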
Like, the before-vs-after section doesn't even seem to create the same thing: the before has no speed-up, the after does.
In the end it seems they basically created a few services ("recipes") that they can reuse to do simple things like a 2x speed-up, combining audio and video, and so on.
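A minimal sketch of what such "recipes" could look like: named ffmpeg argument templates that a script (or agent) fills in. The names and structure here are my assumption, not the article's actual implementation:

```python
# Hypothetical recipe table: reusable ffmpeg argument templates.
RECIPES = {
    # 2x speed-up: halve video timestamps, double audio tempo
    "speedup_2x": [
        "-filter_complex", "[0:v]setpts=0.5*PTS[v];[0:a]atempo=2.0[a]",
        "-map", "[v]", "-map", "[a]",
    ],
    # mux one video stream with one audio stream, no re-encode
    "combine_av": ["-c", "copy", "-map", "0:v:0", "-map", "1:a:0"],
}

def build_cmd(recipe, inputs, output):
    cmd = ["ffmpeg"]
    for path in inputs:          # one -i per input file
        cmd += ["-i", path]
    return cmd + RECIPES[recipe] + [output]

print(build_cmd("speedup_2x", ["in.mp4"], "fast.mp4"))
```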
Or you could go one step further and create a special workflow that lets you define some inputs and iterate with an LLM until the user gets what he wants. For this, you would need to generate outputs and have the user validate what the LLM has created before finally saving the recipe.
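That iterate-then-validate loop could be sketched like this; the `propose()` LLM call and the preview/validation step are stubbed, and everything here is hypothetical:

```python
# Sketch of an LLM-in-the-loop recipe builder. propose() stands in for an
# LLM call returning candidate ffmpeg args; validate() stands in for the
# user approving a rendered preview. All names are hypothetical.
def propose(prompt, feedback=None):
    # stub: a real version would call an LLM with the prompt and feedback
    return ["-filter:v", "setpts=0.5*PTS", "-an"]

def refine_recipe(prompt, validate, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):
        args = propose(prompt, feedback)
        ok, feedback = validate(args)   # user inspects a rendered preview
        if ok:
            return args                 # only now is the recipe saved
    return None                         # gave up without user approval

recipe = refine_recipe("speed up 2x, drop audio",
                       validate=lambda args: (True, None))
print(recipe)
```

The key design choice is that nothing is persisted until the user explicitly approves an output.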
`-filter_complex_script` is a thing.
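For the unfamiliar: `-filter_complex_script` reads the filtergraph from a file instead of the command line, which keeps long graphs out of shell-quoting trouble. A minimal sketch, with placeholder paths:

```python
# Sketch: put the filtergraph in a file and reference it with
# -filter_complex_script. graph.txt / in.mp4 / out.mp4 are placeholders.
graph = "[0:v]setpts=0.5*PTS[v];[0:a]atempo=2.0[a]"
with open("graph.txt", "w") as f:
    f.write(graph)

cmd = ["ffmpeg", "-i", "in.mp4",
       "-filter_complex_script", "graph.txt",
       "-map", "[v]", "-map", "[a]", "out.mp4"]
print(" ".join(cmd))
```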
FFmpeg has complex syntax because it’s dealing with the _complexity of video_. I agree with everyone about knowing (and helping create or contribute to) our tools.
Today I largely forget about the _legacy_ of video, the technical challenges, and how critical it was to get it right.
There are an incredible number of output formats and considerations for _current_ screens (desktop, tablet, mobile, tv, etc…). Then we have a whole other world on the creation side for capture, edit, live broadcast…
Legacy formats used to be so complex, with standards, requirements, and evolving formats. Today, we don't even think about why 29.97fps is still around. Interlacing?
We have a mix of so many incredible (and sometimes frustrating) codecs, needs and final outputs, so it’s really amazing the power we have with a tool like FFmpeg… It’s daunting but really well thought out.
So just a big thanks to the FFmpeg team for all their incredible work over the years…
It's dealing with 3D data (more if you count audio and other tracks) and multi-dimensional transforms, from a command line.
It works 99% of the time for my use case.