frontpage.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
593•klaussilveira•11h ago•176 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
901•xnx•17h ago•545 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
22•helloplanets•4d ago•16 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
95•matheusalmeida•1d ago•22 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
28•videotopia•4d ago•0 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
203•isitcontent•11h ago•24 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
199•dmpetrov•12h ago•91 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
313•vecti•13h ago•137 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
353•aktau•18h ago•176 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
355•ostacke•17h ago•92 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
459•todsacerdoti•19h ago•231 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
24•romes•4d ago•3 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
259•eljojo•14h ago•155 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
80•quibono•4d ago•18 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
7•bikenaga•3d ago•1 comment

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
392•lstoll•18h ago•266 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
53•kmm•4d ago•3 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
234•i5heu•14h ago•178 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
46•gfortaine•9h ago•13 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
122•SerCe•7h ago•103 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
136•vmatsiiako•16h ago•60 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
68•phreda4•11h ago•12 comments

Understanding Neural Networks, Visually

https://visualrambling.space/neural-network/
271•surprisetalk•3d ago•37 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
25•gmays•6h ago•7 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1044•cdrnsf•21h ago•431 comments

Zlob.h: 100% POSIX and glibc compatible globbing lib that is faster and better

https://github.com/dmtrKovalenko/zlob
13•neogoose•4h ago•9 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
171•limoce•3d ago•91 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
60•rescrv•19h ago•22 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
89•antves•1d ago•66 comments

WebView performance significantly slower than PWA

https://issues.chromium.org/issues/40817676
27•denysonique•8h ago•5 comments

How/why to sweep async tasks under a Postgres table

https://taylor.town/pg-task
102•ostler•2mo ago

Comments

koolba•2mo ago
The article says:

> Never Handroll Your Own Two-Phase Commit

And then buried at the end:

> A few notable features of this snippet:

> Limiting number of retries makes the code more complicated, but definitely worth it for user-facing side-effects like emails.

This isn't two-phase commit. This is lock the DB indefinitely while the remote system is processing and pray we don't crash saving the transaction after it completes. That lock also eats up a database connection, so your concurrency is limited by the size of your DB pool.

More importantly, if the email sends but the transaction to update the task status fails, it will try again. And again. Forever. If you're going to track retries it would have to be before you start the attempt. Otherwise the "update the attempts count" logic itself could fail and lead to more retries.

The real answer to all this is to use a provider that supports idempotency keys. Then you can retry the action repeatedly without it actually happening again. My favorite article on this subject: https://brandur.org/idempotency-keys
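
A minimal sketch of that ordering: charge one attempt before talking to the provider, and reuse the task id as the idempotency key. The attempts column, the params shape, and the sendEmail wrapper are invented for illustration; the article's snippet only returns task_id, task_type and params.

  // bump the attempt counter *before* calling the provider, so a crash
  // mid-send still counts against the retry budget
  // (assumes an integer `attempts` column defaulting to 0)
  const [task] = await sql`
    update task
    set attempts = attempts + 1
    where task_id = ${taskId}
      and attempts < 5
    returning task_id, params
  `;
  if (task) {
    // the provider dedupes on this key, so a retried send is a no-op
    await sendEmail({
      to: task.params.email,
      template: 'WELCOME',
      idempotencyKey: `task-${task.task_id}`,
    });
    await sql`delete from task where task_id = ${task.task_id}`;
  }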

maxmcd•2mo ago
Just that row should be locked, since it's "for update skip locked".

I agree the concurrency limitation is kind of rough, but it's kind of elegant because you don't have to implement some kind of timeout/retry thing. You're certainly still exposed to the possibility of double-sending, so yes, probably much nicer to update the row to "processing" and re-process those rows on a timeout.

morshu9001•2mo ago
Idempotency is key. Stripe is good about that.
surprisetalk•2mo ago
Author here! Posting from phone while traveling so sorry for bad formatting.

It was outside of the scope of this essay, but a lot of these problems can be resolved with a mid-transaction COMMIT and reasonable timeouts

You can implement a lean idempotency system within the task pattern like this, but it really depends on what you need and what failures you want to prevent

Thanks for providing more context and safety tips! :)

tracker1•2mo ago
For similar systems I've worked on, I'll use a process id, try/tries and time started as part of the process of plucking an item off a db queue table... this way I can have something that resets anything started over N minutes prior that didn't finish, for whatever reason (handling abandoned, broken tasks that are in an unknown state).

One reason to do this for emails, i.e. a database queue, is to keep a log/history of all sent emails, as well as something to render for "view in browser" links in the email itself. Not to mention those rare instances where an email platform goes down and everything starts blowing up.
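
Roughly what that claim/reclaim looks like in SQL (column and worker names invented, not tracker1's actual schema):

  -- claim: stamp the row with who took it and when, and count the try
  update task
  set picked_by = 'worker-42',
      started_at = now(),
      tries = tries + 1
  where task_id = (
    select task_id
    from task
    where picked_by is null
    order by created_at
    limit 1
    for update skip locked
  )
  returning task_id, task_type, params;

  -- sweep: release anything that has been "running" for more than N minutes
  update task
  set picked_by = null,
      started_at = null
  where picked_by is not null
    and started_at < now() - interval '10 minutes';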

greener_grass•2mo ago
> The real answer to all this is to use a provider that supports idempotency keys. Then you can retry the action repeatedly without it actually happening again. My favorite article on this subject: https://brandur.org/idempotency-keys

Turtles all the way down?

Let's say you are the provider that must support idempotency keys. How should it be done?

nothrabannosir•2mo ago
Offer 99.something% guaranteed exactly-once-delivery. Compete on number of nines. Charge appropriately.
lelanthran•2mo ago
> This isn't two-phase commit.

Agreed

> This is lock the DB indefinitely while remote system is processing and pray we don't crash saving the transaction after it completes.

I don't really see the problem here (maybe due to my node.js skillz being less than excellent), because I don't see how it's locking the table; that one row would get locked, though.

Tostino•2mo ago
You are right. It is just a row level lock... But that doesn't change the fact you are explicitly choosing to use long running transactions, which adds to table bloat and eats active connections to your DB as mentioned. It also hurts things like reindexing.

I prefer an optimistic locking solution for this type of thing.

lelanthran•2mo ago
> But that doesn't change the fact you are explicitly choosing to use long running transactions, which adds to table bloat and eats active connections to your DB as mentioned.

TBH, I'm not too worried about that either - my understanding from the content is that you'd have a few tasks running in the background that service the queue; even one is enough.

I'd just consider that to be always-active, and turn the knobs accordingly.

> It also hurts things like reindexing.

I dunno about this one - does re-indexing on the queue occur often enough to matter? After all, this is a queue of items to process. I'd be surprised if it needed to be re-indexed at all.

Tostino•2mo ago
To start off, I said optimistic locking before and I actually meant pessimistic locking.

But I think it totally depends on what your queue is used for. In my case, I need durable queues that report status/errors and support retries/back off.

I deal with it using updates rather than deleting from the queue, because I need a log of what happened for audit purposes. If I need to optimize later, I can easily partition the table. At the start, I just use a partial index for the items to be processed.

Reindexing and other maintenance functions that need to rewrite the table will happen more often than you'd like in a production system, so I'd rather make them easy to do.
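
A sketch of that update-instead-of-delete layout with a partial index (names are illustrative, not Tostino's actual schema):

  create table task (
    task_id    bigserial primary key,
    task_type  text not null,
    params     jsonb not null,
    status     text not null default 'pending',   -- pending | done | error
    error      text,
    created_at timestamptz not null default now()
  );

  -- only pending rows are indexed, so the "what's next" scan stays small
  -- even as the table accumulates an audit trail of finished tasks
  create index task_pending_idx on task (created_at) where status = 'pending';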

lelanthran•2mo ago
> But I think it totally depends on what your queue is used for.

Agreed

> I deal with it using updates rather than deleting from the queue, because I need a log of what happened for audit purposes. If I need to optimize later, I can easily partition the table. At the start, I just use a partial index for the items to be processed.

That's a good approach. I actually have this type of queue in production, and my need is similar to yours, but the expected load is a lot less - there's an error if the application goes through a day and sees even a few thousand work items added to the queue (this queue is used for user notifications, and even very large clients have only a few thousand users).

So, my approach is to have a retry column that is decremented each time I retry a work item, with items whose retry column has reached zero getting ignored.

The one worker runs periodically (currently every 1m) and processes only those rows with a non-zero retry column and with the `incomplete` flag set.

A different worker runs every 10m and moves expired rows (those with the retry column set to zero) and completed rows (those with a column set to `done` or similar) to a different table for audit/logging purposes. This is why I said that the table containing workitems will almost never be reindexed: all rows added will eventually be removed.
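
A rough sketch of those two workers (table and column names invented, not the actual schema):

  -- worker 1, every 1m: pick up live items inside its transaction
  select workitem_id, payload
  from workitem
  where retries_left > 0
    and not complete
  order by created_at
  for update skip locked;
  -- ...process each row, then decrement retries_left or mark it complete

  -- worker 2, every 10m: move expired and completed rows to the audit table
  with moved as (
    delete from workitem
    where retries_left = 0
       or complete
    returning *
  )
  insert into workitem_audit
  select * from moved;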

------------------------------------------------

The real problem is that the processing cannot be done atomically, even when there is only a single worker.

For example, if the processing is "send email", your system might go down after calling the `send_email()` function in your code and before calling the `decrement_retry()` in the code.

No amount of locking, n-phase commits, etc. can ever prevent the case where the email is sent but the retry counter is not decremented. This is not a solvable problem, so I am prepared to live with it for now with the understanding that the odds are low that this situation will come up, and if it does the impact is low as well (the user gets two notification emails for the same item).

Tostino•2mo ago
I said optimistic in my post above... I really meant pessimistic locking. Just wanted to clarify and couldn't edit original comment.
stack_framer•2mo ago
> I like slim and stupid servers, where each endpoint wraps a very dumb DB query.

I thought I was alone in this, but I build all my personal projects this way! I wish I could use this approach at work, but too many colleagues crave "engineering."

stronglikedan•2mo ago
Doesn't that make for exponentially more requests to get the same data, or possibly more data than you really need (bloated responses)?
never_inline•2mo ago
Some people really overdo HTTP verbs /GET, /POST, /PUT, /DELETE and leave much work to frontend. Irks me a lot.

But then again, there's GraphQL because frontend developers thought backend developers are anti social.

dragonwriter•2mo ago
> Some people really overdo HTTP verbs /GET, /POST, /PUT, /DELETE and leave much work to frontend. Irks me a lot.

If I understand you correctly, I don't think of it as overdoing HTTP verbs so much as using an excessively naive mapping between HTTP resources and base table entities.

rictic•2mo ago
Missing from the article: how to communicate progress and failure to the user?

This is much more complicated with task queues. Doable still! But often skipped, because it's tempting to imagine that the backend will just handle the failure by retrying. But there are lots of kinds of failure that can happen.

The recipient's server doesn't accept the email. The recipient's domain name expired. Actually, we don't have an email address for that recipient at all.

The user has seen "got it, will do, don't worry about it" but if that email is time sensitive, they might want to know that it hasn't been sent yet, and maybe they should place a phone call instead.

nrhrjrjrjtntbt•2mo ago
You can still do that. You can poll status while the page is open. Toast errors or state changes. Even toast them on return. Anything is possible.

After all Amazon does this for a physical order. It may be days before a status update!
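
Client-side, that polling can be as small as this (the endpoint and status values are invented for illustration):

  // poll a task-status endpoint while the page is open, surface failures
  async function pollTask(taskId: string): Promise<void> {
    for (;;) {
      const res = await fetch(`/api/tasks/${taskId}`);
      const { status, error } = await res.json(); // 'pending' | 'done' | 'failed'
      if (status === 'done') return;
      if (status === 'failed') throw new Error(error ?? 'task failed');
      await new Promise((r) => setTimeout(r, 2000)); // check again in 2s
    }
  }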

rgbrgb•2mo ago
If you're in TS/JS land, I like to use an open source version of this called graphile-worker [0].

[0]: https://worker.graphile.org
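
From memory of the graphile-worker docs (worth double-checking against worker.graphile.org, since the API may have moved), the shape is roughly:

  import { run, quickAddJob } from "graphile-worker";

  // worker process: register handlers and start polling the jobs table
  const runner = await run({
    connectionString: process.env.DATABASE_URL,
    concurrency: 5,
    taskList: {
      send_welcome_email: async (payload, helpers) => {
        helpers.logger.info(`emailing ${(payload as any).email}`);
        // ...call the email provider here...
      },
    },
  });
  await runner.promise;

  // elsewhere (e.g. in the signup handler): enqueue a job
  await quickAddJob(
    { connectionString: process.env.DATABASE_URL },
    "send_welcome_email",
    { email: "user@example.com" }
  );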

damidekronik•2mo ago
I am using pgboss myself, very decent, very simple. Had some issues with graphile back in the day, can't remember what exactly; it has probably already overcome whatever I was struggling with!
morshu9001•2mo ago
Never do RPCs during an xact like this! Fastest way to lock up your DB. I don't even mean at large scale. I've been forced many times to set up two-phase commit. That way you also get more flexibility and visibility into what it's doing.
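
One way to keep the remote call out of any transaction is to split the work into two short statements around it. A sketch only; the status/started_at/finished_at columns and sendEmail() are invented, not the article's code:

  // 1) short statement: claim one pending task and mark it in flight
  const [task] = await sql`
    update task
    set status = 'processing', started_at = now()
    where task_id = (
      select task_id from task
      where status = 'pending'
      order by created_at
      limit 1
      for update skip locked
    )
    returning task_id, params
  `;
  if (!task) return;

  // 2) the network call happens with no transaction or row lock held
  await sendEmail(task.params);

  // 3) second short statement: record the outcome
  await sql`
    update task
    set status = 'done', finished_at = now()
    where task_id = ${task.task_id}
  `;
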
efxhoy•2mo ago
I like it! We have a service with a similar postgres task queue but we use an insert trigger on the tasks table that does NOTIFY and the worker runs LISTEN, it feels a bit tidier than polling IMO.
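
The trigger half of that setup looks roughly like this (a sketch, not their actual code):

  create function notify_task_inserted() returns trigger as $$
  begin
    perform pg_notify('task_inserted', new.task_id::text);
    return new;
  end;
  $$ language plpgsql;

  create trigger task_inserted_notify
    after insert on task
    for each row
    execute function notify_task_inserted();

  -- the worker LISTENs on 'task_inserted' and drains the queue on each
  -- notification, e.g. with the postgres.js client:
  --   await sql.listen('task_inserted', () => drainQueue())
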
surprisetalk•2mo ago
LISTEN/NOTIFY works great but they don’t have any mechanism for ACKs or retries so it’s got some tradeoffs to consider. Works great when you’re willing to sacrifice some durability!
parthdesai•2mo ago
Feels tidier till it becomes a bottleneck:

https://www.recall.ai/blog/postgres-listen-notify-does-not-s...

nullzzz•2mo ago
I can recommend this architecture. So much easier to maintain and understand than using an extra service. I didn't look into the implementation here in much detail, but you can surely roll your own if this doesn't cut it for you, or use a library like pgboss.
arjie•2mo ago
The sophisticated solution to this problem is Temporal, but yes, I also use an async task queue frequently because it's very easy to roll one's own.
bsaul•2mo ago
Recently had a conversation with a coworker who thought Temporal was too complex for a first implementation. However, after looking at the documentation it seems that the tech is very approachable. What makes using Temporal complex?
arjie•2mo ago
Perhaps he found the amount of infra required to use it substantial. You can run an async task queue on SQLite with a loop in any language. Temporal has an app SDK, etc.

I do not agree that it is complex but that’s what I’d hypothesize as why someone would think that.

zanellato19•2mo ago
I feel like this is basically what the Rails world does. Sidekiq handles a lot of this for you and it's honestly an amazing tool.

It does rely on redis, but it's basically the same idea.

owenmakes•2mo ago
In Rails 8 you have SolidQueue by default, which doesn't rely on redis
arkh•2mo ago
Same thing with Symfony and its Messenger component when set up to use a database.
claytongulick•2mo ago
At a brief scan of the code, is there a bug with the way task rows are selected and rolled back?

It looks like multiple task rows are being retrieved via a delete...returning statement, and an email is sent for each row. If there's an error, the delete statement is rolled back.

Let's hypothesize that a batch of ten tasks are retrieved, and the 9th has a bad email address, so the batch gets rolled back on error. Next retry the welcome email would be sent again for the ones that succeeded, right?

Even marking the task as "processed" with the tx in the task code wouldn't work, because that update statement would also get rolled back.

Am I missing something? (entirely possible; the code is written in a "clever" style that makes it harder for me to understand, especially the bit where $sql(usr_id) is passed into the sql template before it's been returned; is there CTE magic there that I don't understand?)

I thought that this was the reason that most systems like this only pull one row at a time from the queue with skip locked...

Thanks to anyone who can clarify this for me if it is indeed correct!

w23j•2mo ago
I mean the inner select uses "limit 1", right? So it will usually (but not always as I said in another comment) only delete and return a single task.
claytongulick•2mo ago
Hmm, yep I didn't see that, thanks!

It's a confusing way to do things to me, like, why not select ordered by task date limit 1? Still using for update and skip locked etc... hold the transaction, and update to 'complete' or delete/move the row when done? What's the advantage of the inner select like that?

And I'm still totally confused by:

    const [{ usr_id } = { usr_id: null }] = await sql`
        with usr_ as (
          insert into usr (email, password)
          values (${email}, crypt(${password}, gen_salt('bf')))
          returning *
        ), task_ as (
          insert into task (task_type, params)
          values ('SEND_EMAIL_WELCOME', ${sql({ usr_id })})
        )
        select * from usr_
      `;
This looks to me like usr_id would always be null?

I think the idea is great, I think I'm just struggling a bit with the code style, it seems to be "clever" in a way that increases cognitive load without a real benefit that I can see, but I suppose that's pretty subjective.

w23j•2mo ago
We use a similar approach.

Fun fact: A query like this will, once in a blue moon, return more rows than the limit (here 1), since the inner query is executed multiple times and returns different ids, which is surprising for a lot of people. If your code does not expect that, it may cause problems. (The article's code seems to expect that, since it uses a list and iteration to handle the result.)

  delete from task
  where task_id in
  ( select task_id
    from task
    order by random() -- use tablesample for better performance
    for update
    skip locked
    limit 1
  )
  returning task_id, task_type, params::jsonb as params 
You can avoid that by using a materialized Common Table Expression. https://stackoverflow.com/questions/73966670/select-for-upda...
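
That fix looks roughly like this: the same delete as above, with the inner select forced to run exactly once.

  with picked as materialized (
    select task_id
    from task
    order by random()
    limit 1
    for update skip locked
  )
  delete from task
  where task_id in (select task_id from picked)
  returning task_id, task_type, params::jsonb as params;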

Also, if your tasks take a long time, it will create long-running transactions, which may cause dead tuple problems. If you need to avoid that, you can mark the task as "running" in a short-lived transaction and delete it in another. It becomes more complicated then, since you need to handle the case that your application dies while it has taken a task.

adamzwasserman•2mo ago
Fair warning: don't bounce off the first paragraph like I did. "Dumb queries" made me think the author was arguing against SQL sophistication — I was halfway through composing a rebuttal about stored procedures and keeping logic in the database before I actually read the rest.

Turns out the article advocates exactly that. The example uses CTEs with multi-table inserts. "Dumb" here means "no synchronous external service calls," not "avoid complex SQL."

fastest963•2mo ago
It can be more complicated depending on your environment but I'd prefer instead to use a Pub/Sub pattern instead. You can have a generic topic if you want to model it like the generic table but it handles retries, metrics, scaling, long-running, etc all for you. If you are running in the cloud then Pub/Sub is something you don't need to maintain and will scale better than this solution. You also won't need to VACUUM the Pub/Sub solution ;)