frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
591•klaussilveira•11h ago•170 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
897•xnx•16h ago•544 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
93•matheusalmeida•1d ago•22 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
20•helloplanets•4d ago•13 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
27•videotopia•4d ago•0 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
200•isitcontent•11h ago•24 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
199•dmpetrov•11h ago•91 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
312•vecti•13h ago•136 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
353•aktau•17h ago•176 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
354•ostacke•17h ago•92 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
22•romes•4d ago•3 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
458•todsacerdoti•19h ago•229 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
7•bikenaga•3d ago•1 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
80•quibono•4d ago•18 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
257•eljojo•14h ago•154 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
53•kmm•4d ago•3 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
390•lstoll•17h ago•263 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
231•i5heu•14h ago•177 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
120•SerCe•7h ago•100 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
136•vmatsiiako•16h ago•59 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
68•phreda4•10h ago•12 comments

Zlob.h 100% POSIX and glibc compatible globbing lib that is faste and better

https://github.com/dmtrKovalenko/zlob
13•neogoose•4h ago•8 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
25•gmays•6h ago•7 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
44•gfortaine•9h ago•13 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
271•surprisetalk•3d ago•37 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1043•cdrnsf•20h ago•431 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
171•limoce•3d ago•90 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
60•rescrv•19h ago•22 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
89•antves•1d ago•64 comments

Show HN: ARM64 Android Dev Kit

https://github.com/denuoweb/ARM64-ADK
14•denuoweb•1d ago•2 comments
Open in hackernews

When Sigterm Does Nothing: A Postgres Mystery

https://clickhouse.com/blog/sigterm-postgres-mystery
115•saisrirampur•6mo ago

Comments

gsliepen•6mo ago

  pg_usleep(1000L);
Virtually any time you put a fixed sleep in your program, it's going to be wrong. It is either too long or too short, and even if it is perfect, there is no guarantee your program will actually sleep for exactly the requested amount of time. A condition variable or something similar would be the right thing to use here.

Of course, if this code path really is only taken very infrequently, you can get away with it. But assumptions are not always right, it's better to make the code robust in all cases.

delusional•6mo ago
> there is no guarantee your program will actually sleep for exactly the requested amount of time

This is technically true, but in the same way that under a preemptive operating system there's no guarantee that you'll ever be scheduled. The sleep is a minimum time you will sleep for, beyond that you already have no guarantee about when the next instruction executes.

dwattttt•6mo ago
Spurious wakeups enters the chat.
quietbritishjim•6mo ago
Any reasonable sleep API will handle spurious wake ups under the hood and not return control.
Joker_vD•6mo ago
It will also ignore those pesky signals and restart the sleep :)

On a more serious note, any reasonable sleep API should either report all wake ups, no matter the reason, or none except for the timeout expiration itself. Any additional "reasonable" filtering is a loaded footgun because different API users have different ideas of what is "reasonable" for them.

anarazel•6mo ago
The usleep here isn't one that was intended to be taken frequently (and it's taken in a loop, so a too short sleep is ok). As it was introduced, it was intended to address a very short race that would have been very expensive to avoid altogether. Most of the waiting is intended to be done by waiting for a "transaction lock".

Unfortunately somebody decided that transaction locks don't need to be maintained when running as a hot standby. Which turns that code into a loop around the usleep...

(I do agree that sleep loops suck and almost always are a bad idea)

OptionOfT•6mo ago
If I were reviewing that code I would've asked if they tried yielding. Same effect, but doesn't delay when there is no one to yield to.
aspbee555•6mo ago
I was working through some ffmpeg timing stuff and the ai kept insisting on putting in sleep functions to try and "solve" it
jeffbee•6mo ago
Generative coding models are trained on GitHub. GitHub is completely full of inadvisable code. The model thinks this is normal.
nightfly•6mo ago
Sort of by definition it is normal
Joker_vD•6mo ago
The tl;dr is that Postgres, as any long-running "server" process (especially as a DBMS server!) does not run with SIG_DFL as the handler for SIGTERM; it instead sets up the signal handler that merely records the fact that the signal has happened, in hopes that whatever loops are going on will eventually pick it up. As usual, some loops don't but it's very hard to notice.
b112•6mo ago
Indeed. I've seen DBMSes take close to 10 minutes to gracefully exit, even when idle.

Timeout in sysvinit and service files, for a graceful exit, is typically 900 seconds.

Most modern DBMS daemons will recover if SIGKILL, most of the time, especially if you're using transactions. But startup will then be lagged, as it churns through and resolves on startup.

(unless you've modified the code to short cut things, hello booking.com "we're smarter than you" in 2015 at least, if not today)

sjsdaiuasgdia•6mo ago
Yeah, as I've said on many incident calls...we can pay for transaction recovery at shutdown or we can pay for it at startup, but it's got to happen somewhere.

The "SIGKILL or keep waiting?" decision comes down to whether you believe the DBMS is actually making progress towards shutdown. Proving that the shutdown is progressing can be difficult, depending on the situation.

namibj•6mo ago
Feels like they should come with watchdog timers that alert when a loop fails to check in on this status often enough? Then the only thing that still needs manual checking to cover these more-exotic states is that the code path for loop watchdog-and-signal-checkin will indeed lead to early exit from the loop it's placed in (and all surrounding loops).

Sure, it's more like fuzzing then, but running in production on systems that are willing to send in big reports for logged watchdog alerts would give nearly free monitoring of very diverse and realistic workloads to hopefully cover at least relevant-in-practice codepaths with many samples before a signal ever happens to those more-rare but still relevant code paths...

eunos•6mo ago
With quirk like these, honestly I'm not even confident how can PostgreSQL or any software in general can be used in mission critical system.
contravariant•6mo ago
Might be a bit late now to start telling people that maybe using software for everything wasn't a good idea.
eunos•6mo ago
But I want my Software-Defined-Aviation now
wolrah•6mo ago
And it's not like hardware can't get itself in to infinite loops (often with destructive results instead of just denial of service).

If you think leaking memory or handles is bad, watch what happens when a turbo starts leaking oil in to the intake tract of a diesel engine. It's exciting, as long as you're not paying for it or in the shrapnel zone.

vladms•6mo ago
Have you ever read the contraindication list of a medicine? It is the same kind of trade-off. Someone can choose to use something in a "mission critical system" (whatever that is) because the alternatives are worse, not because the chosen solution is perfect (that does not mean I would advise using PostgreSQL - just accepting that it can happen).

On the other hand "quirks like these" are the reason you should not update too often mission critical systems.

cedws•6mo ago
When you look hard enough it’s a miracle anything works at all.
OkPin•6mo ago
Fascinating root cause: a missing CHECK_FOR_INTERRUPTS() left pg_create_logical_replication_slot basically unkillable on hot standbys. Simple fix, but huge impact.

Makes me wonder how many other Postgres processes might ignore SIGTERM under edge conditions. Do folks here test signal handling during failovers or replica maintenance? Seems like something worth adding to chaos tests.

MuffinFlavored•6mo ago
Is that the root cause though? Why did this process get into a position that it needed to be manually killed/restarted, isn't that the real problem?
bhaak•6mo ago
> While the Postgres community has an older and sometimes daunting contribution process of submitting patches to a mailing list, [...]

The Linux kernel works the same way.

What are other big open source projects using if they are not using mailing lists?

I could imagine some using GitHub but IME it's way less efficient than mailing list. (If I were a regular contributor to a big project using GitHub, me being me, I would probably look for or write myself a GitHub to e-mail gateway).

Bigpet•6mo ago
You can already subscribe to projects or single issues/PRs on github and reply via email to post comments.

I can understand not wanting to use GitHub/GitLab/etc. for various reasons. But I don't understand how usability vs mailing lists is one.

How is a set of 9+ mailing lists any better? It has significantly worse discovery and search tools unless you download all the archives. So you're creating a hurdle for people there already.

Then you have people use all kinds of custom formatting in their email clients, so consistent readability is out the window.

People will keep top-posting (TOFU), transforming the inconsistent styles in the process. Creating an unnecessarily complicated problem for your email client to "detect quotes", or you have to keep reminding people.

Enforcing structure of any kind in email lists seems so tedious. I'm not advocating for bugzilla style "file out these 20 nonsensical fields before you can report anything" but some minimal structure enforced by some tooling as opposed to manual moderation seems very helpful to me.

bhaak•6mo ago
It's not about usability for newcomers (a mailing list is really not ideal for them) but usability for regular contributors and especially for lengthy discussions.

Call me old fashioned but there is no better way of discussing stuff online in an asynchronous way (OTOH video calls are better, but face to face doesn't scale) than a threaded medium. We don't have a better tool to manage threaded discussions than what mail clients (or Usenet clients but they are almost identical in handling).

Linear discussion forums (which GitHub issues are) are just inferior.

voidfunc•6mo ago
Kubernetes works through GitHub and that's probably one of the bigger ones.
gpavanb•6mo ago
The fundamental problem is that a synchronous, lock-based approach is being used in an asynchronous, event-driven context (logical replication).

The Postgres team needs to go back to the drawing board and avoid polling altogether and more importantly have event-listener based approaches for primary and replicas separately

It's great to see ClickHouse contributing to Postgres though. Cross-pollination between two large database communities can have a multiplying effect.

anarazel•6mo ago
> The fundamental problem is that a synchronous, lock-based approach is being used in an asynchronous, event-driven context (logical replication).

This isn't hit in ongoing replication, this is when creating the resources for a new replica on the publisher. And even there the poll based thing is only hit due to a function being used in a context that wasn't possible at the time it was being written (waiting for transactions to complete in a hot standby).

> The Postgres team needs to go back to the drawing board and avoid polling altogether and more importantly have event-listener based approaches for primary and replicas separately

That's, gasp, actually how it already works.

outworlder•6mo ago
> That's, gasp, actually how it already works.

Right?

Armchair software engineers are so incredibly annoying.

Postgres codebase is a shining example of good engineering. They walk a very fine line between many drawbacks. Code is battle tested (and surprisingly readable). Not many people (or groups) can do much better - and if they think they can, they are probably going to learn a lesson in humility.

anarazel•6mo ago
> Postgres codebase is a shining example of good engineering. They walk a very fine line between many drawbacks. Code is battle tested (and surprisingly readable). Not many people (or groups) can do much better - and if they think they can, they are probably going to learn a lesson in humility.

To be fair, we have/PG has plenty crud-y code... Some of it even written by yours truly :(. Including some of the code involved here.

But just saying "go back to the drawing board" without really understanding the problem isn't particularly useful.

gpavanb•6mo ago
I'd be happy to make the changes to the codebase but there's definitely scope for improvement. Putting a sleep to avoid repeated lock acquisition should have raised an eyebrow or two when it was implemented.

As mentioned, there should be a read-replica specific code that works using event listeners. Repurposing the primary's code is what causes the issue.

The following pseudocode would give a better picture on the next steps

// Read replica specific implementation

void StandbyWaitForTransactionCompletion(TransactionId xid) {

// Register to receive WAL notifications about this transaction RegisterForWALNotification(xid);

    while (TransactionIdIsInProgress(xid)) {
        // Wait for WAL record indicating transaction completion
        WaitForWALEvent();
        CHECK_FOR_INTERRUPTS();
        
        // Process any incoming WAL records
        ProcessIncomingWAL();
    }
    
    UnregisterForWALNotification(xid);
}
anarazel•6mo ago
This has absolutely nothing to do with the problem at hand.
caffeinated_me•6mo ago
Good find! I've seen similar behavior before and was wondering why it wasn't easy to stop.

This isn't the only place Postgres can act like this, though. I've seen similar behavior when a foreign data wrapper times out or loses connection, and had to resort to either using kill -9 or attaching to the process using a debugger and closing the socket, which oddly enough also worked.

Might be worth generalizing this approach to also handle that kind of failure

fmajid•6mo ago
I’ve seen this behavior (backend processes don’t honor pg_terminate_backend(pid); ), albeit in an older version (PostgreSQL 14), and it was caused by excessively long plans for the LLVM-based JIT. Disabling the JIT fixed it.