frontpage.

NYC Mayoral Inauguration bans Raspberry Pi and Flipper Zero alongside explosives

https://blog.adafruit.com/2025/12/30/nyc-mayoral-inauguration-bans-raspberry-pi-and-flipper-zero-...
34•ptorrone•43m ago•9 comments

FediMeteo: A €4 FreeBSD VPS Became a Global Weather Service

https://it-notes.dragas.net/2025/02/26/fedimeteo-how-a-tiny-freebsd-vps-became-a-global-weather-s...
164•birdculture•3h ago•41 comments

A faster heart for F-Droid. Our new server is here

https://f-droid.org/2025/12/30/a-faster-heart-for-f-droid.html
172•kasabali•4h ago•73 comments

Show HN: 22 GB of Hacker News in SQLite

https://hackerbook.dosaygo.com
237•keepamovin•6h ago•77 comments

A Vulnerability in Libsodium

https://00f.net/2025/12/30/libsodium-vulnerability/
153•raggi•5h ago•15 comments

Zpdf: PDF text extraction in Zig – 5x faster than MuPDF

https://github.com/Lulzx/zpdf
85•lulzx•3h ago•33 comments

Electrolysis can solve one of our biggest contamination problems

https://ethz.ch/en/news-and-events/eth-news/news/2025/11/electrolysis-can-solve-one-of-our-bigges...
106•PaulHoule•5h ago•15 comments

Toro: Deploy Applications as Unikernels

https://github.com/torokernel/torokernel
108•ignoramous•6h ago•88 comments

Honey's Dieselgate: Detecting and tricking testers

https://vptdigital.com/blog/honey-detecting-testers/
38•AkshatJ27•1h ago•6 comments

Project ideas to appreciate the art of programming

https://codecrafters.io/blog/programming-project-ideas
4•vitaelabitur•24m ago•0 comments

Loss32: Let's Build a Win32/Linux

https://loss32.org/
162•akka47•1d ago•271 comments

OpenAI's cash burn will be one of the big bubble questions of 2026

https://www.economist.com/leaders/2025/12/30/openais-cash-burn-will-be-one-of-the-big-bubble-ques...
62•1vuio0pswjnm7•1h ago•55 comments

Everything as code: How we manage our company in one monorepo

https://www.kasava.dev/blog/everything-as-code-monorepo
152•benbeingbin•3h ago•138 comments

Professional software developers don't vibe, they control

https://arxiv.org/abs/2512.14012
79•dpflan•3h ago•100 comments

Reverse Engineering a Mysterious UDP Stream in My Hotel (2016)

https://www.gkbrk.com/hotel-music
156•bayesnet•1w ago•22 comments

Non-Zero-Sum Games

https://nonzerosum.games/
300•8organicbits•11h ago•160 comments

The British empire's resilient subsea telegraph network

https://subseacables.blogspot.com/2025/12/the-british-empires-resilient-subsea.html
150•giuliomagnifico•10h ago•39 comments

Escaping containment: A security analysis of FreeBSD jails [video]

https://media.ccc.de/v/39c3-escaping-containment-a-security-analysis-of-freebsd-jails
22•todsacerdoti•3h ago•0 comments

Approachable Swift Concurrency

https://fuckingapproachableswiftconcurrency.com/en/
145•wrxd•10h ago•59 comments

Igniting the GPU: From Kernel Plumbing to 3D Rendering on RISC-V

https://mwilczynski.dev/posts/riscv-gpu-zink/
56•michalwilczynsk•9h ago•7 comments

Times New American: A Tale of Two Fonts

https://hsu.cy/2025/12/times-new-american/
197•firexcy•10h ago•125 comments

U.S. cybersecurity experts plead guilty for ransomware attacks

https://www.tomshardware.com/tech-industry/cyber-security/u-s-cybersecurity-experts-plead-guilty-...
43•robotnikman•1h ago•6 comments

Hive (YC S14) Is Hiring a Staff Software Engineer (Data Systems)

https://jobs.ashbyhq.com/hive.co/cb0dc490-0e32-4734-8d91-8b56a31ed497
1•patman_h•8h ago

Sabotaging Bitcoin

https://blog.dshr.org/2025/12/sabotaging-bitcoin.html
14•zdw•2h ago•0 comments

Postgres extension complements pgvector for performance and scale

https://github.com/timescale/pgvectorscale
103•flyaway123•6d ago•22 comments

Braid Math Article

https://mathvoices.ams.org/mathmedia/tonys-take-april-2022/
4•marysminefnuf•1w ago•0 comments

Go away Python

https://lorentz.app/blog-item.html?id=go-shebang
314•baalimago•14h ago•309 comments

HTTP Strict Transport Security (HSTS)

https://hstspreload.org/
30•arunc•1d ago•20 comments

What Happened to Abit Motherboards

https://dfarq.homeip.net/what-happened-to-abit-motherboards/
61•zdw•8h ago•50 comments

Netflix Open Content

https://opencontent.netflix.com/
561•tosh•13h ago•111 comments

Zpdf: PDF text extraction in Zig – 5x faster than MuPDF

https://github.com/Lulzx/zpdf
85•lulzx•3h ago

Comments

lulzx•3h ago
I built a PDF text extraction library in Zig that's significantly faster than MuPDF for text extraction workloads.

~41K pages/sec peak throughput.

Key choices: memory-mapped I/O, SIMD string search, parallel page extraction, streaming output. Handles CID fonts, incremental updates, all common compression filters.

~5,000 lines, no dependencies, compiles in <2s.

Why it's fast:

  - Memory-mapped file I/O (no read syscalls)
  - Zero-copy parsing where possible
  - SIMD-accelerated string search for finding PDF structures
  - Parallel extraction across pages using Zig's thread pool
  - Streaming output (no intermediate allocations for extracted text)

What it handles:

  - XRef tables and streams (PDF 1.5+)
  - Incremental PDF updates (/Prev chain)
  - FlateDecode, ASCII85, LZW, RunLength decompression
  - Font encodings: WinAnsi, MacRoman, ToUnicode CMap
  - CID fonts (Type0, Identity-H/V, UTF-16BE with surrogate pairs)
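To illustrate two of the bullets above (memory-mapped I/O and locating the XRef data), here is a minimal Python sketch. It is not zpdf's code: the function name `find_last_startxref` is hypothetical, and the only PDF fact it relies on is that a file ends with a `startxref` keyword followed by the byte offset of the most recent cross-reference section. Searching backwards means that, for a file with incremental updates, the newest section is found first.

```python
import mmap


def find_last_startxref(path):
    """Return the offset written after the last `startxref` keyword, or None.

    Hypothetical helper for illustration only; it is not part of zpdf.
    """
    with open(path, "rb") as f:
        # Map the file read-only; the OS pages data in on demand instead of
        # the program copying it into a buffer with explicit read() calls.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            keyword = b"startxref"
            pos = mm.rfind(keyword)  # search from the end of the file
            if pos < 0:
                return None
            # The offset is the first whitespace-separated token after the keyword.
            tail = mm[pos + len(keyword): pos + len(keyword) + 32]
            fields = tail.split()
            if not fields or not fields[0].isdigit():
                return None
            return int(fields[0])
```

A real extractor would then jump to that offset and parse the XRef table or stream, following `/Prev` for older revisions.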
tveita•1h ago
What kind of performance are you seeing with/without SIMD enabled?

From https://github.com/Lulzx/zpdf/blob/main/src/main.zig it looks like the help text cites an unimplemented "-j" option to enable multiple threads.

There is a "--parallel" option, but that is only implemented for the "bench" command.

lulzx•1h ago
I have now made parallel by default and added an option to enable multiple threads.

I haven't tested without SIMD.

cheshire_cat•1h ago
You've released quite a few projects lately, very impressive.

Are you using LLMs for parts of the coding?

What's your work flow when approaching a new project like this?

littlestymaar•1h ago
> Are you using LLMs for parts of the coding?

I can't talk about the code, but the readme and commit messages are most likely LLM-generated.

And when you take into account that the first commit happened just three hours ago, it feels like the entire project has been vibe coded.

Neywiny•47m ago
Hard disagree. Initial commit was 6k LOC. Author could've spent years before committing. Ill-advised but not impossible.
littlestymaar•27m ago
Why would you make Claude write your commit message for a commit you've spent years working on though?
Neywiny•15m ago
1. Be not good at or a fan of git when coding

2. Be not good at or a fan of git when committing

Not sure what the disconnect is.

Now if it were vibecoded, I wouldn't be surprised. But benefit of the doubt

lulzx•1h ago
Claude Code.
jeffbee•1h ago
What's fast about mmap?
jonstewart•44m ago
What’s the fidelity like compared to tika?
lulzx•32m ago
The accuracy difference is marginal (1-2%) but the speed difference is massive.
agentifysh•1h ago
excellent stuff, what makes Zig so fast?
observationist•1h ago
Not being slow - it compiles straight to machine code, it isn't interpreted, and it has aggressive, opinionated optimizations baked in by default, so it's even faster than compiled C (under default conditions.)

Contrasted with Python, which is interpreted, has a clunky runtime, minimal optimizations, and all sorts of choices that result in slow, redundant, and also slow, performance.

The price for performance is safety checks, redundancy, how badly wrong things can go, and so on.

A good compromise is LuaJIT - you get some of the same aggressive optimizations, but in an interpreted language, with better-than-C performance but interpreted language convenience, access to low level things that can explode just as spectacularly as with Zig or C, but also a beautiful language.

agentifysh•1h ago
will add this to the list, now learning new languages is less of a barrier with LLMs
Zambyte•32m ago
Zig is safer than C under default conditions, not faster. By default it does a lot of illegal-behavior safety checking, such as array and slice bounds checking, numeric overflow checking, and invalid union access checking. These checks are disabled by certain (non-default) build modes, or can be explicitly disabled at a per-scope level.

It may be easier to write code that runs faster in Zig than in C under similar build optimization levels, because writing high performance C code looks a lot like writing idiomatic Zig code. The Zig standard library offers a lot of structures like hash maps, SIMD primitives, and allocators with different performance characteristics to better fit a given use-case. C application code often skimps on these things simply because they involve a lot more friction in C than in Zig.

AndyKelley•1h ago
It makes your development workflow smooth enough that you have the time and energy to do stuff like all the bullet points listed in https://news.ycombinator.com/item?id=46437289
forgotpwd16•10m ago
>you have the time and energy to do stuff like all the bullet points listed

Don't disagree, but in this specific case, per the author, the project was made via Claude Code. Although it could as well be that Zig is a better LLM target. I've noticed many new vibe-coded projects choose Zig as their target.

mpeg•1h ago
very nice, it'd be good to see a feature comparison as when I use mupdf it's not really just about speed, but about the level of support of all kinds of obscure pdf features, and good level of accuracy of the built-in algorithms for things like handling two-column pages, identifying paragraphs, etc.

the licensing is a huge blocker for using mupdf in non-OSS tools, so it's very nice to see this is MIT

python bindings would be good too

lulzx•1h ago
added a comparison, will improve further. https://github.com/Lulzx/zpdf?tab=readme-ov-file#comparison-...

also, added python bindings.

mpeg•53m ago
thanks, claude, I guess haha

as others have commented, I think while this is a nice portfolio piece, I would worry about its longevity as a vibe coded project

chanbam•3m ago
If he made something legitimately useful, who cares how?
odie5533•1h ago
Now we just need Python bindings so I can use it in my trash language of choice.
lulzx•1h ago
added python bindings!
hiq•43m ago
Were you working on it already, or did it take you less than 17 minutes to commit https://github.com/Lulzx/zpdf/commit/9f5a7b70eb4b53672c0e4d8... ?
littlestymaar•1h ago
- First commit 3 hours ago.

- commit message: LLM-generated.

- README: LLM-generated.

I'm not convinced that projects vibe coded over the evening deserve the HN front page…

Edit: and of course the author's blog is also full of AI slop…

2026 hasn't even started and I already hate it.

kingkongjaffa•1h ago
Wait, but why?

If it's really better than what we had before, what does it matter how it was made? It's literally hacked together with the tools of the day (LLMs); isn't that the very hacker ethos? Patching stuff together that works in a new and useful way.

5x speed improvements on pdf text extraction might be great for some applications I'm not aware of, I wouldn't just dismiss it out of hand because the author used $robot to write the code.

Presumably the thought to make the thing in the first place and decide what features to add and not add was more important than how the code is generated?

dmytrish•13m ago
...and it does not work. I tried it on ~10 random pdfs, including very simple ones (e.g. a hello world from typst), it segfaults on every single one.
forgotpwd16•6m ago
Tried a few and it works. Maybe you have an older or newer Zig version than whatever the project targets. (Mine is 0.15.2.)
forgotpwd16•17m ago

  74910,74912c187768,187779
  < [Example 1: If you want to use the code conversion facetcodecvt_utf8to output tocouta UTF-8 multibyte sequence
  < corresponding to a wide string, but you don't want to alter the locale forcout, you can write something like:\237 D.27.21954
                                                                                                                                \251ISO/IECN4950wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
  < std::string mbstring = myconv.to_bytes\050L"Hello\134n"\051;
  ---
  >
  > [Example 1: If you want to use the code conversion facet codecvt_utf8 to output to cout a UTF-8 multibyte sequence
  > corresponding to a wide string, but you don’t want to alter the locale for cout, you can write something like:
  >
  > § D.27.2
  > 1954
  >
  > © ISO/IEC
  > N4950
  >
  > wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
  > std::string mbstring = myconv.to_bytes(L"Hello\n");
It is indeed faster, but the output is messier. And unlike mutool, it doesn't handle Unicode. (That probably also explains the big speed boost.)
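The `\050`, `\051` and `\134` sequences in the first half of the diff look like PDF literal-string escapes that were passed through undecoded (octal 050, 051 and 134 are `(`, `)` and `\`). As a rough sketch of what decoding them involves, here is a small Python function that follows the escape rules for PDF literal strings. The name `decode_pdf_literal` is hypothetical, and this is only an assumption about where the messy output comes from, not a diagnosis of zpdf's code.

```python
def decode_pdf_literal(raw: bytes) -> bytes:
    """Decode backslash escapes inside a PDF literal string (without the outer parens).

    Hypothetical helper for illustration only.
    """
    simple = {
        ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09,
        ord("b"): 0x08, ord("f"): 0x0C,
        ord("("): ord("("), ord(")"): ord(")"), ord("\\"): ord("\\"),
    }
    out = bytearray()
    i, n = 0, len(raw)
    while i < n:
        c = raw[i]
        if c != 0x5C:  # not a backslash: copy through unchanged
            out.append(c)
            i += 1
            continue
        i += 1  # skip the backslash
        if i >= n:
            break  # trailing lone backslash is dropped
        c = raw[i]
        if c in simple:
            out.append(simple[c])
            i += 1
        elif 0x30 <= c <= 0x37:
            # Octal escape: one to three octal digits, high-order overflow ignored.
            value, digits = 0, 0
            while i < n and digits < 3 and 0x30 <= raw[i] <= 0x37:
                value = value * 8 + (raw[i] - 0x30)
                i += 1
                digits += 1
            out.append(value & 0xFF)
        elif c in (0x0A, 0x0D):
            # Backslash followed by end-of-line is a line continuation.
            i += 1
            if c == 0x0D and i < n and raw[i] == 0x0A:
                i += 1
        else:
            # Unknown escape: the backslash is ignored, the character is kept.
            out.append(c)
            i += 1
    return bytes(out)
```

Applied to the fragment from the diff, `decode_pdf_literal(rb'myconv.to_bytes\050L"Hello\134n"\051;')` yields the bytes of `myconv.to_bytes(L"Hello\n");`, matching the mutool side. The decoded bytes still have to be mapped through the font's encoding or ToUnicode CMap to become text.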
lulzx•7m ago
will fix.
TZubiri•3m ago
Lol, but there are 100 competitors in the PDF text extraction space, some of them multi-million-dollar businesses: AWS Textract, ABBYY PDFreader, PDFBox. I think you may be underestimating the challenge here.
TZubiri•4m ago
In my experience with parsing PDFs, speed has never been an issue, it has always been a matter of quality.