Community, all the HN belong to you. This is an archive of Hacker News that fits in your browser. When I made HN Made of Primes I realized I could probably do this offline SQLite/WASM thing with the whole archive, all the GBs of it. The whole dataset. So I tried it, and this is it. Have Hacker News on your device.
Go to this repo (https://github.com/DOSAYGO-STUDIO/HackerBook): you can download it. BigQuery -> ETL -> npx serve docs - that's it. 20 years of HN arguments and beauty can be yours forever. So they'll never die. Ever. It's the unkillable static archive of HN and it's in your hands. That's my year-end gift to you all. Thank you for a wonderful year, and have a happy and wonderful 2026. Make something of it.
carbocation•30m ago
That repo is throwing up a 404 for me.
Question - did you consider tradeoffs between duckdb (or other columnar stores) and SQLite?
keepamovin•21m ago
No, I just went straight to sqlite. What is duckdb?
cess11•7m ago
It is very similar to SQLite in that it can run in-process and store its data as a file.
It's different in that it is tailored to analytics: among other things, storage is columnar, and it can run directly off some common data analytics file formats.
fsiefken•2m ago
DuckDB is an open-source column-oriented Relational Database Management System (RDBMS). It's designed to provide high performance on complex queries against large databases in embedded configuration.
"DICT FSST (Dictionary FSST) represents a hybrid compression technique that combines the benefits of Dictionary Encoding with the string-level compression capabilities of FSST.
This approach was implemented and integrated into DuckDB as part of ongoing efforts to optimize string storage and processing performance."
https://homepages.cwi.nl/~boncz/msc/2025-YanLannaAlexandre.p...
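To make the quoted description a little more concrete, here is a minimal sketch of plain dictionary encoding, which is only the first half of the technique named above. It does not implement FSST and is not DuckDB's code; the function names and the sample column are made up for illustration.

```python
from typing import Dict, List, Tuple


def dict_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Store each distinct string once and replace every value with a small integer code."""
    dictionary: List[str] = []
    index: Dict[str, int] = {}
    codes: List[int] = []
    for value in values:
        if value not in index:
            # First time we see this string: give it the next free code.
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dict_decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical column with few distinct values, like an item "type" field.
column = ["story", "comment", "comment", "story", "comment"]
dictionary, codes = dict_encode(column)
restored = dict_decode(dictionary, codes)
```

Dictionary encoding pays off when a column has few distinct values. As the quote describes it, the hybrid scheme adds FSST's string-level compression on top, which this sketch leaves out.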
linhns•20m ago
Not the author here. I’m not sure about DuckDB, but SQLite lets you simply use a file as a database, and for archiving that’s really helpful. One file, that’s it.
cobolcomesback•14m ago
DuckDB does as well. A super simplified explanation of duckdb is that it’s sqlite but columnar, and so is better for analytics of large datasets.
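As a rough illustration of the "SQLite but columnar" point, the sketch below lays the same records out row-wise and column-wise in plain Python. It is a toy model of the two layouts, not how either engine stores data on disk, and the sample values are invented.

```python
from collections import Counter

# Row-oriented layout: all fields of one record sit together.
rows = [
    {"id": 1, "type": "story", "time": 1700000000},
    {"id": 2, "type": "comment", "time": 1700000060},
    {"id": 3, "type": "comment", "time": 1700000120},
]

# Column-oriented layout: all values of one field sit together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An analytic query such as "count items per type" only has to read
# the one column it needs, instead of walking every full record.
counts = Counter(columns["type"])
```

Fetching one whole item by id favors the row layout, while scans and aggregates over a few fields favor the column layout.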
formerly_proven•5m ago
The schema is this: items(id INTEGER PRIMARY KEY, type TEXT, time INTEGER, by TEXT, title TEXT, text TEXT, url TEXT
Doesn't scream columnar database to me.
embedding-shape•2m ago
At a glance, that is missing (at least) a `parent` or `parent_id` attribute which items in HN can have (and you kind of need if you want to render comments), see http://hn.algolia.com/api/v1/items/46436741
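A minimal sketch of why a parent reference matters for rendering comments, using Python's built-in `sqlite3` module. The `parent` column is an assumption added to the schema quoted above (the archive's real column name is not shown in this thread), and the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE items(
        id INTEGER PRIMARY KEY, type TEXT, time INTEGER, "by" TEXT,
        title TEXT, text TEXT, url TEXT,
        parent INTEGER  -- assumed column, not part of the quoted schema
    )
    """
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "story", 1700000000, "alice", "Show HN: example", None, "https://example.com", None),
        (2, "comment", 1700000060, "bob", None, "first reply", None, 1),
        (3, "comment", 1700000120, "carol", None, "nested reply", None, 2),
        (4, "comment", 1700000180, "dave", None, "second reply", None, 1),
    ],
)

# Walk the thread from the story down, tracking nesting depth.
thread = conn.execute(
    """
    WITH RECURSIVE thread(id, depth) AS (
        SELECT id, 0 FROM items WHERE id = ?
        UNION ALL
        SELECT items.id, thread.depth + 1
        FROM items JOIN thread ON items.parent = thread.id
    )
    SELECT thread.id, thread.depth, items."by"
    FROM thread JOIN items ON items.id = thread.id
    ORDER BY thread.id
    """,
    (1,),
).fetchall()
```

Without some parent reference there is nothing to join on, so the nesting depth of each comment could not be recovered from the table alone.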
3eb7988a1663•17m ago
While I suspect DuckDB would compress better, given the ubiquity of SQLite, it seems a fine standard choice.
wslh•23m ago
Is this updated regularly? I'm getting a 404 on GitHub, as the other comment says.
With all due respect, it would be great if there were an official public HN dump available (one that doesn't require something like BigQuery, which is expensive).
yupyupyups•16m ago
1 hour passed and it's already nuked?
Thank you btw
asdefghyk•1h ago
How much space is needed for the data?
I'm wondering if it would work on a tablet.
fsiefken•2m ago
DuckDB has transparent compression built in and supports natural language queries. https://buckenhofer.com/2025/11/agentic-ai-with-duckdb-and-s...