frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
591•klaussilveira•11h ago•173 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
897•xnx•16h ago•544 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
93•matheusalmeida•1d ago•22 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
20•helloplanets•4d ago•13 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
27•videotopia•4d ago•0 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
201•isitcontent•11h ago•24 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
199•dmpetrov•11h ago•91 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
312•vecti•13h ago•136 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
353•aktau•18h ago•176 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
354•ostacke•17h ago•92 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
23•romes•4d ago•3 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
458•todsacerdoti•19h ago•229 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
7•bikenaga•3d ago•1 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
80•quibono•4d ago•18 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
258•eljojo•14h ago•155 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
391•lstoll•17h ago•264 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
53•kmm•4d ago•3 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
231•i5heu•14h ago•177 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
122•SerCe•7h ago•101 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
45•gfortaine•9h ago•13 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
136•vmatsiiako•16h ago•59 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
68•phreda4•11h ago•12 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
271•surprisetalk•3d ago•37 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
25•gmays•6h ago•7 comments

Zlob.h 100% POSIX and glibc compatible globbing lib that is faste and better

https://github.com/dmtrKovalenko/zlob
13•neogoose•4h ago•8 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1043•cdrnsf•20h ago•431 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
171•limoce•3d ago•91 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
60•rescrv•19h ago•22 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
89•antves•1d ago•66 comments

Show HN: ARM64 Android Dev Kit

https://github.com/denuoweb/ARM64-ADK
14•denuoweb•1d ago•2 comments
Open in hackernews

Understanding Deflate

https://jjrscott.com/to-deflate-or-not/
83•ingve•4mo ago

Comments

lifthrasiir•4mo ago
Not illustrated here, but the vast majority of deflated data uses a dynamic Huffman code which compresses the Huffman tree itself with another (fixed) Huffman tree. This sort of nested compression is in fact moderately popular; JPEG XL for example uses the same technique.
Dylan16807•4mo ago
> That said, even in this simple case, decoding by hand was a pain.

Well doing it by hand at this level is mostly decoding quasi-ASCII by hand, and the compression isn't making it significantly more painful.

If you reorder the static huffman table and byte align things, the encoded data could look more like:

  T O B E O R N O T T 0x85 0x09 T 0x88 0x0F 0xFF
So short lengths become 0x80 + length and short distances encode as themselves.

This version is pretty easy to decode by hand, and still 16 bytes. I'm arguably cheating by removing the 3 bit header, but even 17 bytes would be good.

Though the encoder deciding to repeat those Ts doesn't make much sense, shouldn't this be compressed to literal("TOBEORNOT") copy(6,9) copy(9,15)?

ack_complete•4mo ago
Many Deflate encoders, including zlib/gzip, are greedy encoders that work forwards, either for speed or to support streaming compression. They encode runs as they are found scanning forward, with limited lookahead to try to allow longer matches to preempt immediately preceding shorter matches. There is an "optimal parse" strategy that maximizes the runs found by processing the entire file backwards.

If you repack the plaintext using zopfli, it does encode the text as you suggest.

Dylan16807•4mo ago
Being greedy is fine here, though. It's the exact same match but somehow cut one byte short.

Also I just tested something. If you stick a random extra letter onto the start of the string, the mistake goes away and the output shrinks by a byte. Is it possibly an issue with finding matches that start at byte 0...?

More testing: Removing the Ts that fail to match to get TOBEORNOTOBEOROBEORNOT shrinks the output by 2 bytes. Changing the first character to ZOBEORNOTOBEOROBEORNOT stays at the same shrunken size. Then removing that Z to get OBEORNOTOBEOROBEORNOT makes it balloon back up as now it fails to match the O twice.

If I take the original string and start prepending unique letters, the first one shrinks and fixes the matching, and then each subsequent letter adds one byte. So it's not that matching needs some particular alignment, it's only failing to match the very first byte.

A new test: XPPPPPPPP appears to encode as X, P, then a copy command. And PPPPPPPPP encodes as P, P, then a copy. Super wrong.

ack_complete•4mo ago
Interesting, I can also reproduce this. I wonder if it's an artifact of zlib's sliding window setup. The odd part is that if I try various libs with advzip, both the libdeflate and 7z modes show the same artifact, only the zopfli mode is able to avoid it. Doesn't seem to be a format violation as gzip -t doesn't complain about a copy at position 0.
zmodem•4mo ago
> Though the encoder deciding to repeat those Ts doesn't make much sense, shouldn't this be compressed to literal("TOBEORNOT") copy(6,9) copy(9,15)?

IIRC zlib won't consider matches at offset 0: https://github.com/madler/zlib/blob/5a82f71ed1dfc0bec044d970...

userbinator•4mo ago
but I think the bits must be the opposite "endian" here, ie 2 instead of one

I remember asking the zlib developers many years ago why DEFLATE's bit order was unusual in that it made table-based decoding slightly more complex than necessary, and was told that only Phil Katz knew.

Also of note is that LHA/LZH uses an extremely similar data format where Huffman is also used to encode both literals and lengths in one "alphabet": http://justsolve.archiveteam.org/wiki/LHA Given that LHA very slightly predated ZIP, I suspect the latter was developed based on the former's algorithms.