Google Vertex AI information disclosure incident

https://docs.cloud.google.com/vertex-ai/generative-ai/docs/security-bulletins
2•k1w1•2m ago•0 comments

Ntfsplus: NTFS Filesystem Remake

https://lore.kernel.org/all/20251020020749.5522-1-linkinjeon@kernel.org/
1•JamesCoyne•7m ago•0 comments

Anti-science bills hit statehouses, strip away 100y of public health protections

https://globalnews.ca/news/11487389/anti-science-bills-hit-statehouses-stripping-away-public-heal...
2•WarOnPrivacy•8m ago•0 comments

Next.js 16

https://nextjs.org/blog/next-16
2•FBISurveillance•8m ago•0 comments

Ruby on Rails 8.1 Released

https://github.com/rails/rails/releases/tag/v8.1.0
1•FBISurveillance•12m ago•0 comments

More Than 100 Cases of Measles Reported in Utah and Arizona in Latest Outbreak

https://www.nytimes.com/2025/10/21/well/more-than-100-cases-of-measles-reported-in-utah-and-arizo...
4•WarOnPrivacy•13m ago•2 comments

Vision Pro with M5 [video]

https://www.youtube.com/watch?v=EOg0Ryig3rQ
1•fadedsignal•14m ago•0 comments

Eric Lu Wins International Chopin Piano Competition

https://www.nytimes.com/2025/10/21/arts/music/chopin-piano-competition-eric-lu.html
2•laserson•16m ago•0 comments

AmigaOS 3.3 will be released in 2026

https://www.amiga-news.de/en/news/AN-2025-10-00103-EN.html
1•doener•17m ago•0 comments

The hottest term in AI is completely made up

https://www.washingtonpost.com/technology/2025/10/21/nvidia-ai-factories/
3•reaperducer•18m ago•1 comments

PageIndex Chat – Human-Like Long Document AI Analyst

https://pageindex.ai/blog/pageindex-chat
1•mingtianzhang•18m ago•1 comments

Using Async Functions in Celery with Django Connection Pooling

https://mrdonbrown.blogspot.com/2025/10/using-async-functions-in-celery-with.html
1•ipeev•21m ago•0 comments

Thinking Sparks: Emergent Attention Heads in Reasoning Models

https://arxiv.org/abs/2509.25758
1•diwank•25m ago•0 comments

Modern AI on Vintage Hardware: Llama 2 Runs on Windows 98

https://hackaday.com/2025/01/13/modern-ai-on-vintage-hardware-llama-2-runs-on-windows-98/
1•JumpCrisscross•28m ago•1 comments

Amazon Plans to Replace More Than Half a Million Jobs with Robots

https://www.nytimes.com/2025/10/21/technology/inside-amazons-plans-to-replace-workers-with-robots...
2•bookofjoe•37m ago•2 comments

Show HN: ProfiTree, Tax Optimization Tool for DIY Investors

https://www.profitree-tax.com/
1•shahakshat609•38m ago•0 comments

Large Language Models Inference Engines Based on Spiking Neural Networks

https://arxiv.org/abs/2510.00133
2•PaulHoule•39m ago•0 comments

Effects of Swallowable Intragastric Balloon on Weight Loss, Metabolic Syndrome (2017)

https://www.gavinpublishers.com/article/view/effect-of-a-new-swallowableintragastric-balloon-elip...
1•nateb2022•40m ago•0 comments

Surfacing LLM Biases Through Graffiti

https://nullpxl.com/post/surfacing-llm-biases-through-graffiti/
2•nullpxl•40m ago•0 comments

Dangerous and invisible worm found in Visual Studio Code extensions

https://www.heise.de/en/news/Dangerous-and-invisible-worm-found-in-Visual-Studio-Code-extensions-...
3•croes•42m ago•1 comments

Ask HN: Codex / Claude Code vs. Cursor?

2•mholubowski•42m ago•0 comments

Daniel J. Bernstein updated cdb (Constant database) to go beyond 4GB

https://cdb.cr.yp.to/
5•kreco•43m ago•0 comments

Did people in the 90s worry about the efficiency of the internet

3•burgiee•43m ago•2 comments

GitHub Copilot's "Free Plan Limit" Bug: A Year-Long Oversight?

https://danielraffel.me/2025/10/22/github-copilots-free-plan-limit-bug-a-year-long-oversight/
1•atupem•45m ago•0 comments

DHS Asks OpenAI to Unmask User Behind ChatGPT Prompts, Possibly First Such Case

https://gizmodo.com/dhs-asks-openai-to-unmask-user-behind-chatgpt-prompts-possibly-the-first-such...
4•mrtesthah•47m ago•0 comments

Show HN: Streaky – GitHub Streak Monitor with Distributed Cron Processing

https://github.com/0xReLogic/Streaky
1•0xrelogic•48m ago•0 comments

Fork Buckets Like You Fork Code

https://www.tigrisdata.com/blog/fork-buckets-like-code/
1•raoufchebri•48m ago•0 comments

A ritual and the toxic effects of ranking

https://mailchi.mp/f6f9b751ce8c/resilience-postcard-lonely-1669813
1•pcfwik•51m ago•0 comments

Pathom 3 – a Clojure library modelling information systems as attribute graphs

https://pathom3.wsscode.com/
1•Tevo•53m ago•0 comments

Rematch Accelerated by Network Next

https://mas-bandwidth.com/rematch-accelerated-by-network-next/
2•gafferongames•57m ago•0 comments

Build your own database

https://www.nan.fyi/database
351•nansdotio•8h ago
Source material: Designing Data-Intensive Applications https://www.oreilly.com/library/view/designing-data-intensiv...

Comments

4ndrewl•6h ago
> Databases were made to solve one problem:
>
> "How do we store data persistently and then efficiently look it up later?"

Isn't that two problems?

dayjaby•6h ago
"Store data persistently so it can be looked up efficiently" sounds like a single problem.
SirFatty•6h ago
Definitely two.
cjbgkagh•6h ago
It’s not persistent if it can’t be recovered later
stvltvs•6h ago
Puts message in a bottle and tosses into the most convenient black hole.
BetaDeltaAlpha•5h ago
Doesn't the black hole compress the bottle beyond recovery?
stvltvs•4h ago
Not necessarily, opinions vary.

https://www.sciencenewstoday.org/do-black-holes-destroy-or-s...

SahAssar•6h ago
"Store data persistently" implies "it can be looked up" since if you cannot look it up it is impossible to know if it is stored persistently.

The "efficiently" part can be considered a separate problem though.

prerok•5h ago
Well, if you just want to store data, you can use files. Lookup is a bit tedious and inefficient.

So, if we consider that persistent storage is a solved problem, then we can say that the reason for databases was how to look up data efficiently. In fact, that is why they were invented, even if persistent storage is a prerequisite.

nonethewiser•5h ago
How about "store data in certain way." That sounds more like 1 problem and encompasses an even larger problem space.
grokgrok•6h ago
How do we reconstruct past memory states? That's the fundamental problem.

Efficiency of storage or retrieval, reliability against loss or corruption, security against unwanted disclosure or modification are all common concerns, and the relative values assigned to these features and others motivate database design.

kiitos•5h ago
> How do we reconstruct past memory states? That's the fundamental problem.

reconstructing past memory states is rarely, if ever, a requirement that needs to be accommodated in the database layer

nonethewiser•5h ago
Can you elaborate? That certainly seems to be what happens in a typical crud app. You have some model for your data which you persist so that it can be loaded later. Perhaps partially at times.

In another context perhaps you're ingesting data to be used in analytics, which seems to fit the "reconstruct past memory state" framing less.

i_k_k•6h ago
I always wanted to ship a write-only database. Lightning fast.
elygre•6h ago
Back in the 80s a professor at our college got a presentation on the concept of «write-only memory» accepted for some symposium.

Good times.

thomasjudge•1h ago
Very secure!
pcdevils•5h ago
Pretty much how eventstoredb works. Deleting data fully only happens at scavenge which rewrites the data files.
hxtk•5h ago
I think it was a joke. It sounds like you read it as append-only, like most LSM tree databases (not rewriting files in the course of write operations), but I think GP meant it as write-only to the exclusion of reads, roughly equivalent to `echo $data > /dev/null`
datadrivenangel•4h ago
I've forgotten how to count that low. [0]

0 - https://www.youtube.com/watch?v=3t6L-FlfeaI

archerx•5h ago
That would be useful for logging.
warkdarrior•5h ago
If it's write-only, and no reads ever happen, one can write to /dev/null without loss of utility.
mewpmewp2•5h ago
It would be good for before going to sleep then.
Etheryte•4h ago
Also useful for backups, so long as you don't need to restore.
pratik661•5h ago
This is analogous to an elevator that’s unidirectional
rzzzt•5h ago
One that lets people enter. We will figure out exiting later, with exiting on a different floor as a stretch goal.
theideaofcoffee•4h ago
Or just a paternoster
nonethewiser•5h ago
It's amusing to me that this is really quite a pedantic observation yet it's driving very earnest engagement from hackernews. Myself included. Absolutely nothing in this article is riding on whether it's 1 or 2 problems - it's an aside at best. Yet I'm still trying to think through if it's 1 or 2. I mean, the "and" is right there - that clearly suggests two. It's almost comical even, to say "Here is one problem: X and Y." Yet in another way it seems like 2 sides of the same coin.

I guess there is a rather fine line between philosophy and pedantry.

Maybe we can think about it from another angle. If they are 2 problems databases were designed to solve, then that means this is a problem databases were designed to solve: storing data persistently.

Is that really a problem databases were designed to solve? Not really. We had that long before databases. It was already solved. It's a pretty fundamental computer operation. Isn't it fair to say this is one thing? "Storing data so it can be retrieved efficiently."

gingersnap•4h ago
You're thinking of regex
mrighele•4h ago
It is a single problem that contains two smaller problems, but the actual hard part (a third problem, if you wish) is putting them together. If you limit yourself to solve those two problems independently you won't have a (useful) database.
didip•4h ago
Off by 1 error is indeed a hard problem.
whartung•4h ago
> Isn't that two problems?

No, that would be regexes.

mamcx•2h ago
You can decompose it into 2 problems, because, well, that is better, but it is in fact one. It can be argued that there is only this single problem:

How, in an ACID way, do we store data that will be efficiently looked up later by an unknown number of clients with unknown access patterns, concurrently, without blocking all the participants, in a fast way?

And then add SQL (ouch!)

cube2222•5h ago
I clicked through a couple of the articles in the OP, and I must say, the design and animations are extremely pretty!

Kudos for that!

235ylkj•5h ago
Here's a simple key-value store inspired by D.B. Cooper:

  ~/bin/cooper-db-set
  ===================
  #! /bin/bash

  key="$1"
  value="$2"

  echo "${key}:${value}" >> /dev/null


  ~/bin/cooper-db-get
  ===================
  #! /bin/bash

  key="$1"
  </dev/null awk -F: -v key="$key" '$1 == key {result = $2} END {print result}'
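
For contrast, a minimal sketch of the same two commands backed by a real append-only file instead of /dev/null. The function names `db_set` and `db_get` and the `DB_FILE` path are assumptions made for illustration, not part of the scripts above. Values containing `:` or newlines are not handled.

```shell
#!/bin/bash
# Hypothetical storage location; export DB_FILE beforehand to override it.
DB_FILE="${DB_FILE:-${TMPDIR:-/tmp}/append-only.db}"

# Append one "key:value" record; older records for the key are never rewritten.
db_set() {
  printf '%s:%s\n' "$1" "$2" >> "$DB_FILE"
}

# Scan the whole log and print the most recently written value for the key.
db_get() {
  [ -f "$DB_FILE" ] || return 1
  awk -F: -v key="$1" '$1 == key { result = $2 } END { print result }' "$DB_FILE"
}
```

Writes stay cheap because they only append, while every read scans the full file, which is roughly the trade-off that indexes and sorted segments are meant to address.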
MathMonkeyMan•5h ago
/dev/null is persistent across restarts and cache friendly, so it's got you covered.
skeptrune•5h ago
I love the design and examples in this post. Easy to read for sure.

Exercises like this also seem fun in general. It's a real test of how much you know to start anything from scratch.

kevinqi•5h ago
my only minor critique is using lorem ipsum examples. It tends to make me want to gloss over instead of reading; I prefer seeing realistic data. other than that, it's a really cool post
WD-42•4h ago
Was going to post the same thing. Lorem Ipsum makes the data too hard to distinguish. I get that due to the dynamic nature of the examples the text needed to be generated, but Latin isn't the best choice IMO.

Otherwise great article, thank you!

doublerabbit•1h ago
It's the same for me when foo and bar are used as examples.
ashleyn•5h ago
I was tempted to knee-jerk dismiss this as "don't write your own database, don't even use a KV database, just use SQL". And then I remembered the only reason I'd say this is because I went through designing my own DB or using KV databases just to avoid SQL...only to realise I was badly reinventing SQL. It could be worth the lesson.
FpUser•5h ago
> Problem. "How do we store data persistently and then efficiently look it up later?"

I would say without transactions it is not a database yet from a practical standpoint.

dangoodmanUT•5h ago
I think a lot of databases would disagree
FpUser•4h ago
You might be on to something here ;)
alecco•4h ago
But they are web scale!
myth_drannon•4h ago
I also recommend this free online book to build a database https://build-your-own.org/database/
bionsystem•3h ago
I remember an article here, maybe a year ago, where somebody showed some database concepts with bash examples (like "write your db in bash"), but I can't find it anywhere. Does anybody have it?
DiabloD3•4h ago
It looks like it got hugged to death already.
winrid•4h ago
Needs a faster database
keybored•4h ago
Part of the reason why I'm not a "maker" is because my mind gets ahead of me with all the things that I would need to do in order to do things properly. So the article starts out interesting and then gets more and more, well, not exactly stressful but I get a bit weary by it.

Not that I would aspire to implement a general-purpose database. But even smaller tasks can make my mind spin too much.

browningstreet•4h ago
I don't disagree with your take in general, but I do think it's different reading about minutiae than being invested in it. If you actually are curing these requirements it's probably quite engaging. If not, the eyes and mind start to gloss over them.

As a different example: I'm moving this week. I've known I'm moving for a while. Thinking about moving -- and all the little things I have to do -- is way more painful than doing them. Thinking about them keeps me up at night, getting through my list today is only fractionally painful.

I'm also leveling up a few aspects of my "way of living" in the midst of all this, and it'd be terribly boring to tell others about it, but when next Monday comes.. it'll be quite sweet indeed.

keybored•4h ago
> As a different example: I'm moving this week. I've known I'm moving for a while. Thinking about moving -- and all the little things I have to do -- is way more painful than doing them. Thinking about them keeps me up at night, getting through my list today is only fractionally painful.

this sounds familiar... :)

nawgz•3h ago
Have you considered if you have ADHD?
chrisallick•4h ago
if author is reading, can you add an rss feed to your site? i want to add to feedly.
constantcrying•4h ago
I absolutely love this "first principles" approach of explaining a topic. You can really go through this and at each time understand what problem needs to be solved and what other problems this introduces, until you get at a reasonably satisfying solution.
exdeejay_•4h ago
The first example in the "Sorting in Practice" section appears to be broken. The text makes it seem like the list should be sorted in-memory and then written to disk sorted, but the example un-sorts the list when it's written to disk.

Edit: the flush example (2nd one) in the recap section does the same thing, when the text says that the records are supposed to be written to the file in sorted order.
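
A minimal sketch of the behaviour the text describes, assuming records are `key:value` lines and a plain file stands in for the in-memory buffer. The function name `flush_sorted` is hypothetical, and this is not the article's implementation.

```shell
#!/bin/bash
# Write buffered "key:value" records to a segment file ordered by key.
#   $1: file holding the unsorted records (stand-in for the in-memory buffer)
#   $2: destination segment file
# -s keeps records that share a key in their original (insertion) order.
flush_sorted() {
  LC_ALL=C sort -s -t: -k1,1 "$1" > "$2"
}
```

With a buffer containing `b:2`, `a:1`, `c:3`, the segment on disk would read `a:1`, `b:2`, `c:3`, which is the order the examples were expected to show.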

0xb0565e486•4h ago
I have been spending the last ~4 weeks writing a triple store!

I wish this came out earlier, there are a few insights in there that took me a while to understand :)

saxelsen•4h ago
Nice interactivity, but this is taken straight from Designing Data-Intensive Applications. Literally all the content here is an interactive version of chapter 3.

Maybe give credit?

tomhow•44m ago
Thanks, we've added this to the thread's top text.
vladpowerman•4h ago
Great read. I’ve been modeling developer activity as a time series key value system where each developer is a key and commits are values. Faced the same issues: logs grow fast, indexes get heavy, range queries slow down. How do you decide what to drop when compacting segments? Balancing freshness and retention is tricky.
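
One common baseline for the compaction question is last-write-wins: merge segments from oldest to newest and keep only the newest record per key. A minimal sketch under that assumption, with `key:value` lines and a hypothetical function name `compact_segments`:

```shell
#!/bin/bash
# Merge segment files, passed oldest first, keeping the newest record per key.
# Later files overwrite earlier entries in the awk array; the final sort
# restores key order, since awk's "for (k in ...)" order is unspecified.
compact_segments() {
  awk -F: '{ latest[$1] = $0 } END { for (k in latest) print latest[k] }' "$@" |
    LC_ALL=C sort -t: -k1,1
}
```

This drops all history for a key, so it would not suit a time series where every commit matters; a retention cut-off on a timestamp field would be a separate policy layered on top.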
orliesaurus•3h ago
am i the only one who IS a huge fan of this blogpost layout
jumploops•3h ago
“LSM trees are the underlying data structure used for [..] DynamoDB, and they have proven to perform really well at scale [..] 80 million requests per second!”

This is a tad bit misleading, as the LSM is used for the node-level storage engine, but doesn’t explain how the overall distributed system scales to 80 million rps.

iirc the original Dynamo paper used BerkeleyDB (b-tree or LSM), but the 2012 paper shifted to a fully LSM-based engine.