Edge264 – Minimalist, high-performance software decoder for H.264/AVC video

https://github.com/tvlabs/edge264
160•andsoitis•4mo ago

Comments

ebb_earl_co•4mo ago
I like the `VARIANTS` env. var [0] to take advantage of x86_64 newer extensions if one’s processor has them.

CachyOS is a whole distro compiled with these flags, if possible, which is appealing.

[0] https://github.com/tvlabs/edge264#compiling-and-testing

kimixa•4mo ago
I wonder why they use multiple executables instead of something like function multiversioning [0]

[0] https://gcc.gnu.org/onlinedocs/gcc/Function-Multiversioning....
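For reference, a minimal sketch of what function multiversioning can look like with GCC's `target_clones` attribute. It assumes x86-64 with GCC or Clang and a glibc-based Linux, since the mechanism relies on IFUNC; on other platforms the macro expands to nothing and a single version is built. The function name `sum_u8` is made up for illustration and is not taken from edge264.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h> /* also pulls in the libc feature macros (__GLIBC__) */

/* Assumption: target_clones needs IFUNC support, so only enable it on
 * x86-64 with GCC or Clang and glibc; elsewhere build one plain version. */
#if defined(__x86_64__) && defined(__GNUC__) && defined(__GLIBC__)
#define MULTIVERSION __attribute__((target_clones("avx2", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical kernel: the compiler emits one clone per listed target
 * plus a resolver that picks a clone when the symbol is relocated. */
MULTIVERSION
uint32_t sum_u8(const uint8_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += p[i];
    return acc;
}
```

Callers just call `sum_u8(buf, len)`; which clone runs is decided once at load time, not per call.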

kg•4mo ago
Function multiversioning would require indirect jumps/indirect calls, wouldn't it? Separate executables can do static jumps/calls.
kimixa•4mo ago
On Linux it uses IFUNC, resolved at load/dynamic relocation time, so at runtime it's the same cost as any other (relocatable) function call. But they're "static" in that it's not a calculated address, so pretty easy for a superscalar CPU to follow.

So it does have some limitations like not being inlined, same as any other external function.

eru•4mo ago
What about duplicating the entire executable essentially a few times, and jumping to the right version at the very beginning of execution?

You have bigger binaries, but the logistics are simplified compared to shipping multiple binaries and you should get the same speed as multiple binaries with fully inlined code.

Since they don't seem to be doing that, my question is: what's the caveat I'm missing? (Or are the bigger binaries enough of a caveat by themselves?)

mikepurvis•4mo ago
Ideally you only need to duplicate until you hit the first not-inlined function call; at that point there’s nothing gained and it’s just a waste of binary size.
astrange•4mo ago
There's no need to do any of that, a table of function pointers to DSP functions works fine.

It can be useful to duplicate the entire code for 8-bit vs 10-bit pixels because that does affect nearly everything.
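A minimal sketch of the function-pointer-table approach, assuming GCC or Clang on x86-64 for the CPU check (other targets simply keep the portable C version). The names `dsp_table`, `dsp_init`, and `add_clamp_*` are hypothetical and not taken from libavcodec or edge264.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DSP kernel: add a residual to 8-bit pixels with clamping. */
typedef void (*add_clamp_fn)(uint8_t *dst, const int16_t *res, size_t n);

static void add_clamp_c(uint8_t *dst, const int16_t *res, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = dst[i] + res[i];
        dst[i] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
    }
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Same loop, but compiled with AVX2 enabled for this function only, so
 * the compiler is free to vectorize it with wider registers. */
__attribute__((target("avx2")))
static void add_clamp_avx2(uint8_t *dst, const int16_t *res, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = dst[i] + res[i];
        dst[i] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
    }
}
#endif

typedef struct {
    add_clamp_fn add_clamp;
} dsp_table;

/* Fill the table once at startup; hot loops then call through it. */
void dsp_init(dsp_table *t)
{
    assert(t != NULL);
    t->add_clamp = add_clamp_c;
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("avx2"))
        t->add_clamp = add_clamp_avx2;
#endif
}
```

After `dsp_init(&t)`, the decoder calls `t.add_clamp(dst, res, n)` and pays one indirect call per kernel invocation, whichever variant was selected.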

amluto•4mo ago
Since TEXTREL is basically gone these days (for good reasons!), IFUNC is the same as any other call that is relocatable to a target not in the same DSO. Which is either a GOT or PLT, either of which ends up being an indirect call (or branch if the compiler feels like it and the PLT isn’t involved). Which is what the person you’re replying to said :)

A relocatable call within the same DSO can be a PC-relative relocation, which is not a relocation at all when you load the DSO and ends up as a plain PC-relative branch or call.

kimixa•4mo ago
Sure, but they're already paying that cost for every non-static function anyway. Any DSO, or executable that allows function interposition, already pays.

Ideally you should just multiversion the topmost exported symbol; everything below that should either be directly inlined or, as the architecture variant is known statically by the compiler, have variants and a direct call generated. I know at least GCC can do this variant generation for things like constant propagation over static function boundaries, so /assume/ it can do the same for other optimization variants like this, but admittedly haven't checked.

URScrewed13•4mo ago
Kenny green
pjmlp•4mo ago
To keep code portable?
nnevatie•4mo ago
Portable multi-versioning is kind of hard to set up. E.g. compilers on Linux are not happy to emit AVX512 intrinsics when the architecture isn't enabled via -m... - this is also true for the case where you're trying to set up a dispatching system relying on cpuid, etc.
Sesse__•4mo ago
Is this specific to AVX512? It works well for e.g. AVX2.
nnevatie•4mo ago
Yes, at least with AVX512 the compiler will throw a fit if you try to use intrinsics without having enabled the architecture TU-wide via options.
Sesse__•4mo ago
Seems to work fine for me: https://gcc.godbolt.org/z/hPexshjoa
nnevatie•4mo ago
Likely a different compiler/version. GCC had this error for me recently:

error: inlining failed in call to 'always_inline' 'float _mm512_reduce_add_ps(__m512)': target specific option mismatch
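
For context, this is the error GCC reports when an `always_inline` intrinsic is called from a function that was not compiled with the matching ISA enabled. A minimal sketch of the usual per-function approach, assuming GCC or Clang on x86-64; it uses AVX2 rather than AVX-512 so that it runs on more machines, and the names `sum8_avx2` and `sum8` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* The target attribute enables AVX2 (and with it AVX) for this one
 * function, so the intrinsics can be inlined without -mavx2 being set
 * for the whole translation unit. */
__attribute__((target("avx2")))
static float sum8_avx2(const float *p)
{
    __m256 v = _mm256_loadu_ps(p);
    __m128 lo = _mm256_castps256_ps128(v);
    __m128 hi = _mm256_extractf128_ps(v, 1);
    __m128 s = _mm_add_ps(lo, hi);              /* 4 partial sums */
    s = _mm_add_ps(s, _mm_movehl_ps(s, s));     /* 2 partial sums */
    s = _mm_add_ss(s, _mm_shuffle_ps(s, s, 1)); /* final sum in lane 0 */
    return _mm_cvtss_f32(s);
}
#endif

/* Dispatcher: check the CPU at run time before taking the AVX2 path,
 * otherwise fall back to a scalar loop over the same 8 floats. */
float sum8(const float *p)
{
    assert(p != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("avx2"))
        return sum8_avx2(p);
#endif
    float acc = 0.0f;
    for (size_t i = 0; i < 8; i++)
        acc += p[i];
    return acc;
}
```

Calling `sum8_avx2` from a function that carries neither the attribute nor the matching `-m` flag is fine; what triggers the mismatch error is using the intrinsics themselves in such a function.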

Sesse__•4mo ago
Compiler Explorer link or it didn't happen? :-)
wyattblue•4mo ago
This may eventually be better for people working in the cloud. Shame there's no Apple Silicon support.

(See also Cisco's openh264, which supports decoding)

zamadatix•4mo ago
Don't all Apple Silicon devices have extremely good (in both speed and feature coverage) H.264 hardware decoders already?
sroussey•4mo ago
Yes, H.264 is in hardware on Apple Silicon.

But as a software decoder which is specifically made to not use hardware APIs for decoding, I am not sure why they skipped ARM64 on non-Linux platforms.

metadat•4mo ago
What if there were some intelligence to test-for and auto-switch to support extensions when available? If you specify it manually it already supports x64-specific instructions via the ${VARIANTS} env var.

https://github.com/tvlabs/edge264/blob/5a3c19fc0ccacb03f9841...

zamadatix•4mo ago
But why go through the trouble of building and shipping a software decoder for a platform you know has no devices which need such a thing? On the other hand it's not too hard to find ARM64 Linux devices which need an efficient software decoder (either because there isn't a hardware one at all, the one that is there is limited in feature support, or the one that is there is hybrid but written so poorly a good software decoder is more efficient).
mikepurvis•4mo ago
Out of curiosity, what does “in hardware” actually mean in this context? Is it pure vhdl? Microcode that leverages special primitives? Something else?
bri3d•4mo ago
In the case of Apple AVD, it's a multi-stage system with a bunch of special primitives, orchestrated by a Cortex-M3 with firmware. Codec-specific frontends emit IR which a less specialized backend can execute.

https://github.com/eiln/avd

This really heavily depends on the device, though. There are all sorts of "hardware" video decoders ranging from fairly generic vector coprocessors running firmware to "pure" HDL/VLSI level implementations. Usually on more modern or advanced hardware you'll see more and more become more general purpose, since a lot of the later stages can be shared across codecs, saving area vs. a pure hardware implementation.

antihero•4mo ago
Do we have hardware H.265 or other more current codec support in hardware on anything?
galad87•4mo ago
Yes, almost everything out there supports at least H.265, H.264, VP9, and AV1 in hardware.
CharlesW•4mo ago
Yes. If there's a hole in macOS's VideoToolbox support it's the middling quality of their hardware-accelerated encoder, so people who want high quality encodes will generally use x264/x265 for that.
bri3d•4mo ago
I had no issues getting this to build, pass tests, and render a video on ARM64 Mac OS X.
astrange•4mo ago
I don't see why this would support Linux arm64 but not macOS.

Anyway, you can just use libavcodec, which is faster (because of frame based multithreading) and doesn't operate on the mistaken belief that it's a good idea to use SIMD intrinsics.

fisf•4mo ago
Care to elaborate? Not shitting on libavcodec here, I would also guess it just beats a new project on raw performance.

But according to the repo, this project also uses both slice and frame multi-threading (as does ffmpeg, with all the tradeoffs).

And SIMD usage is basically table-stakes, and libavcodec uses SIMD all over the place?

astrange•4mo ago
> But according to the repo, this project also uses both slice and frame multi-threading (as does ffmpeg, with all the tradeoffs).

Oh, I missed that since it doesn't have a separate file. In that case they're likely very similar performance-wise. H.264 wasn't well-designed for CPUs because the arithmetic coding could've been done better, but it's not that challenging these days.

> And SIMD usage is basically table-stakes, and libavcodec uses SIMD all over the place?

SIMD _intrinsics_. libavcodec doesn't write DSP functions in assembly for historical reasons - it's because it's just better! It's faster, just as maintainable, at least as easy to read and write, and not any less portable (since it already isn't portable…). Intrinsics are basically a poor way to generate the code you want, interfere with other optimizations like autovectorization, and you might as well write your own code generator instead.

The downsides are it's harder to debug and analyzers like ASan don't work.

wyattblue•4mo ago
That's a good catch. I did not notice they weren't using any assembly at all. I assumed they wanted to add NEON before advertising Apple Silicon but I guess not.

Also, hi FFmpeg twitter.

jackedEngineer•4mo ago
Talk by one of the authors - https://archive.fosdem.org/2025/schedule/event/fosdem-2025-5...
PaywallBuster•4mo ago
slides https://archive.fosdem.org/2025/events/attachments/fosdem-20...
userbinator•4mo ago
It's a big omission to claim "minimalist" but then have no information about code size. Nonetheless, as someone who has written an H.261 through H.263 decoder as a learning exercise, it's good to see more people writing video codecs. Getting high performance may not be straightforward, but the algorithms themselves are well-defined by the standard.

> Access to left/top macroblock values is done with direct offsets in memory instead of copying their values to a buffer beforehand.

I made use of this technique too, so I think it's not particularly novel nor non-obvious. The performance-sensitivity of video decoding necessarily means avoiding any extraneous data movement whenever possible.
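
To make the technique described above concrete, a minimal sketch of reading left/top neighbours through fixed offsets rather than copying them into a scratch buffer first. The `mb_info` layout, the padded array, and the averaging rule are assumptions for illustration and are not edge264's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-macroblock state, stored in one contiguous array. */
typedef struct {
    uint8_t nz_count; /* e.g. a count of non-zero coefficients */
} mb_info;

/* Derive a context value for the current macroblock from its left and
 * top neighbours. `stride` is the number of mb_info entries per row,
 * including one padding entry, so cur[-1] and cur[-stride] are valid
 * as long as the array has a padding row and a padding column. */
static inline unsigned predict_nz(const mb_info *cur, ptrdiff_t stride)
{
    unsigned left = cur[-1].nz_count;
    unsigned top = cur[-stride].nz_count;
    return (left + top + 1) >> 1;
}

/* Tiny demo: a 2x2 macroblock picture with a zeroed border row/column. */
unsigned demo_predict(void)
{
    enum { W = 2, H = 2, STRIDE = W + 1 };
    mb_info mbs[(H + 1) * STRIDE] = {{0}};
    mb_info *first = &mbs[STRIDE + 1]; /* macroblock (0,0) */

    first[0].nz_count = 4;       /* macroblock (0,0) */
    first[1].nz_count = 6;       /* macroblock (1,0) */
    first[STRIDE].nz_count = 10; /* macroblock (0,1) */

    /* Neighbours of (1,1): left is (0,1) = 10, top is (1,0) = 6. */
    const mb_info *cur = &first[STRIDE + 1];
    assert(cur - mbs >= STRIDE + 1); /* both neighbours are in bounds */
    return predict_nz(cur, STRIDE);
}
```

The padding row and column are what keep the edge cases branch-free here: macroblocks on the picture border read zeroed entries instead of needing an availability check.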

Also worth noting: H.264 patents have already expired in most of the world: https://meta.wikimedia.org/wiki/Have_the_patents_for_H.264_M...

jwr•4mo ago
From someone who has worked on H.264 decoding and done some assembly optimization: this is an insanely complex task and a huge effort. Kudos to the author(s).
giorgioz•4mo ago
What are the use cases for this? How can I use this instead of existing decoders and save money?