
Ask HN: How do you keep system context from rotting over time?

32•kennethops•5d ago
Former SRE here, looking for advice.

I know there are a lot of tools focused on root cause analysis after things break. Cool, but that’s not what’s wearing me down. What actually hurts is the constant context switching while trying to understand how a system fits together, what depends on what, and what changed recently.

As systems grow, this feels like it gets exponentially harder. Add logs and now you’ve created a million new events to reason about. Add another database and suddenly you’re dealing with subnet constraints or a DB choice that’s expensive as hell, and no one noticed until later. Everyone knows their slice, but the full picture lives nowhere, so bit rot just keeps creeping in.

This feels even worse now that AI agents are pushing large amounts of code and config changes quickly. Things move faster, but shared understanding falls behind even faster.

I’m honestly stuck on how people handle this well in practice. For folks dealing with real production systems, what’s actually helped? Diagrams, docs, tribal knowledge, tooling, something else? Where does it break down?

Comments

amadeuswoo•5d ago
One thing that’s evidently helped: using CLAUDE.md / agent instructions as de facto architecture docs. If the agent needs to understand system boundaries to work effectively, those docs actually get maintained
kennethops•4d ago
But how do you ensure the .md file is able to see all of the details of the infra?
amadeuswoo•4d ago
You don't, it's a map of intent, not infra state. What exists, why, what talks to what. Live state still needs IaC and observability. The .md captures the 'why' that terraform can't
htrp•4d ago
I don't think OP is looking for context from the AI model perspective but rather a process for maintaining a mental picture of the system architecture and managing complexity.

I'm not sure I've seen any good vendors but I remember seeing a reverse devops tool posted a few days ago that would reverse engineer your VMs into Ansible code. If that got extended to your entire environment, that would almost be an auto documenting process.

dexdal•4d ago
Context rots when it stays implicit. Make the system model an explicit artifact with fixed inputs and checkpoints, then update it on purpose. Otherwise you keep rebuilding the same picture from scratch.
kennethops•4d ago
I'm honestly looking for both. I haven't found a vendor that does this well for just humans, nor am I seeing something that can expose this context, read-only, to all of the AI agent coding models.

I will check that tool out.

liveoneggs•4d ago
Monitoring tools (APM) will show dependencies (web calls, databases, etc) and should contain things like deployment markers and trend lines.

All of those endpoints should be documented in an environment variable or similar as well.

The breakdown is when you don't instrument the same tooling everywhere.

Documentation is generally out of date by the time you finish writing it so I don't really bother with much detail there.
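A minimal sketch of the "endpoints in environment variables" idea above, assuming a naming convention where every dependency endpoint lives in a variable ending in `_URL`. The convention, the variable names, and the hosts are all hypothetical. It turns the environment into a small dependency inventory and drops credentials from the output:

```python
from urllib.parse import urlparse


def dependency_inventory(env, suffix="_URL"):
    """Return {variable name: host[:port]} for every variable ending in `suffix`.

    The `_URL` suffix is an assumed convention, not something the thread specifies.
    """
    inventory = {}
    for name, value in sorted(env.items()):
        if not name.endswith(suffix):
            continue
        parsed = urlparse(value)
        # The URL may carry credentials (user:pass@host); keep only host and port.
        host = parsed.hostname or ""
        if parsed.port:
            host = f"{host}:{parsed.port}"
        inventory[name] = host
    return inventory


# Hypothetical environment for illustration only.
example_env = {
    "BILLING_API_URL": "https://billing.internal.example:8443/v2",
    "ORDERS_DB_URL": "postgres://app:secret@orders-db.internal.example:5432/orders",
    "LOG_LEVEL": "info",
}

inventory = dependency_inventory(example_env)
for name, host in inventory.items():
    print(f"{name:18} -> {host}")
```

In a real service you would pass `os.environ` instead of `example_env`. The output is only as complete as the convention is consistently followed, which matches the caveat above about instrumenting the same way everywhere.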

kennethops•4d ago
This has been my experience as well. imo documentation feels like one of the few areas that AI can be good at today.
liveoneggs•4d ago
It's okay but it often lies. At an SRE level you need a pretty zoomed-out view of the world until you are trying to zoom-in to a problem component.

Always start at the head (what a customer sees -- actually load the website) and work down into each layer.

If something is breaking way downstream and customers don't see it then it doesn't actually matter right now.

nitwit005•4d ago
Every company I've worked with has started with an ER diagram for their primary database (and insisted on it, in fact), only to give up when it became too complex. You quickly hit the point where no one can understand it.

You then eventually have that same pattern happen with services, where people give up on mapping the full thing out as well.

What I've done for my current team is to list the "downstream" services, what we use them for, who to contact, etc. It only goes one level deep, but it's something that someone can read quickly during an incident.
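For illustration, a one-level-deep list like the one described above could be kept as data next to the code and rendered into something readable during an incident. This is only a sketch: the service names, purposes, and contact channels are invented, and the field names are an assumption about what "what we use them for, who to contact" might look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Downstream:
    name: str      # service we call directly (one level deep only)
    used_for: str  # what we rely on it for
    contact: str   # who to reach during an incident


# Hypothetical entries for illustration only.
DOWNSTREAMS = [
    Downstream("payments-gateway", "charging cards at checkout", "#payments-oncall"),
    Downstream("email-relay", "order confirmation emails", "#messaging-oncall"),
]


def render(services):
    """Render one line per downstream service, sorted by name."""
    lines = []
    for s in sorted(services, key=lambda s: s.name):
        lines.append(f"{s.name}: {s.used_for} (contact: {s.contact})")
    return "\n".join(lines)


summary = render(DOWNSTREAMS)
print(summary)
```

Keeping the list flat is deliberate: it does not try to map the whole system, only the direct dependencies a team actually touches.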

kennethops•4d ago
Sorry what is an ER diagram?
gnabgib•4d ago
First hits on DDG, anonymous Google, Bing

ERD/ Entity Relationship Diagram https://www.lucidchart.com/pages/er-diagrams

ERM / Entity-Relationship Model https://en.wikipedia.org/wiki/Entity%E2%80%93relationship_mo...

(same-same, ERD is the more common acronym)

kennethops•4d ago
That is what I figured it would be, but you never know anymore with the number of acronyms thrown around nowadays.
canhdien_15•4d ago
If the system is so good, why constantly change the context?
BOOSTERHIDROGEN•4d ago
I think it is because of a continuous improvement mindset.
canhdien_15•4d ago
Continuous improvement is essential, but we must distinguish between progress and mere decoration. If an old car runs perfectly and a new one offers the same speed but with a different shell, why replace the entire vehicle? It’s a waste of time and resources. Why not focus on upgrading the 'shell' instead of reinventing the wheel?
kennethops•4d ago
but think about the shareholders!
dlcarrier•4d ago
Good hierarchical documentation

A laptop computer is extremely complex, but is actively developed and maintained by a small number of people, built on parts themselves developed by a small number of people, many of which are themselves built on parts themselves developed by a small number of people, and so on and so forth.

This works well in electronics design, because everything is documented and tested to comply with the documentation. You'd think this would slow things down, but developing a new generation of a laptop takes fewer man hours and less calendar time than developing a new generation of any software of a similar complexity running on it, despite the laptop skirting the limits of physics. Technical debt adds up really fast.

The top-level designers only have access to what the component manufacturers have published, and not to their internal designs, but that doesn't matter because the publications include correct and relevant data. When the component manufacturer comes out with something new, they use documentation from their supplier, to design the new product.

As long as each component's documentation is complete and accurate, it will meet all of the needs of anyone using that component. Diving deeper would only be necessary if something is incomplete or inaccurate.

linux4dummies•4d ago
I use Nix (NixOS) with AI agents. It's everything I ever dreamed of and a bit more. Makes all other distros and build systems look old and outdated :D
kennethops•4d ago
Woah what are you doing?
IceCoffe•4d ago
Yeah, I'm curious too. Is this because most of your system can be explained by the NixOS configuration, so the LLM can easily fetch context?
sonofhans•1h ago
More humans. Seriously. Keep more humans in the loop. Everything else does and will fail. Humans add resilience to systems; demand and complexity reduce resilience.

You’re describing the infrastructure of a large system — it’s a custom-built machine designed to serve a custom purpose. There are no examples in the world of things like that working without a lot of human intervention.

This is compounded, as you say, by increasing demands placed on the system: “Now it must react to AIs committing code,” or “Our customer base is growing but your Ops budget is decreasing.” This means the system needs more humans, not fewer.

reactordev•1h ago
This is not what he asked.

Adding more humans seems like an immediate fix but systems of systems exist without humans.

Observability, automation, infrastructure as code, audits, all these things complement the “wtf happened?” scenario and all of these are systems. Not humans.

The SRE needs signal from noise.

gtirloni•1h ago
I think you should have added a disclaimer that you are the founder of a company that provides "Reliability and context for complex environments."

It feels a bit dishonest to be asking for advice on how to tackle the complexity problem for SREs when you're actually providing a paid solution for the very same problem.

shaneoh•1h ago
I'm seeing this pattern pop up more and more all over the place now. It's pervasive throughout Reddit too for example: pick a sub in the area that you built your app in, pose some problem, and then have another account also controlled by you present the solution that you built. All the writing styles in these posts are similar too; it's all likely written by AI, including the post we're commenting on.
v_CodeSentinal•57m ago
I've been working on this problem specifically in the context of autonomous coding agents, and you hit the nail on the head with 'implicit context'.

The biggest issue isn't just that documentation gets outdated; it's that the 'mental model' of the system only exists accurately in a few engineers' heads at any given moment. When they leave or rotate, that model degrades.

We found the only way to really fight this is to make the system self-documenting in a semantic way—not just auto-generated docs, but maintaining a live graph of dependencies and logic that can be queried. If the 'map' of the territory isn't generated from the territory automatically, it will always drift. Manual updates are a losing battle.
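A rough sketch of what a "queryable graph of dependencies" can mean in its simplest form. Here the graph is a hand-written dictionary with made-up service names, which is exactly the thing the comment warns against. In practice the assumption is that `DEPENDS_ON` would be generated from infrastructure definitions or traces rather than typed by hand. The query answers "if this component breaks, what is affected?":

```python
from collections import deque

# Hypothetical service graph: each key depends on the services in its list.
# Assumed to be generated from real sources (IaC, traces) in a real setup.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["orders-db", "cache"],
    "worker": ["orders-db", "queue"],
    "orders-db": [],
    "cache": [],
    "queue": [],
}


def impacted_by(target, depends_on):
    """Return every service that depends on `target`, directly or transitively."""
    # Invert the edges so we can walk from a dependency up to its dependents.
    dependents = {name: set() for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk upward from the target.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


impacted = impacted_by("orders-db", DEPENDS_ON)
print(sorted(impacted))
```

The traversal itself is ordinary breadth-first search. The hard part the comment points at is keeping the input graph generated from the running system so it cannot drift.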

xyzzy_plugh•18m ago
1. You make it declarative. The system definition should be checked in to a repository, or multiple repos. If you're not using infrastructure as code, you should be. This is table stakes.

2. Systems should be explicit, not implicit. Configuration should be explicit wherever possible. Implicit behavior should be documented.

3. Living documentation adjacent to your systems. Write markdown files next to your code. If you keep systems documentation somewhere else (like some wysiwyg knowledge system bullshit) then you must build a markdown-to-whatever sync job (where the results are immutable) else the documentation is immediately out of date, and out of date documentation is just harmful noise.

4. If it's dead, delete it. You have version control for a reason. Don't keep cruft around. If there's a subnet that isn't being used, delete it.

Lastly, if you find yourself in this situation and have none of the above, ask yourself if you really have the agency to fix it -- and I mean really fix it, no half measures -- then do so. If you don't, then your options are to stop caring or find a new job. The alternative is a recipe for burnout.
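One way to picture points 1 and 4 above together: once the system definition is declarative, finding dead resources becomes a set difference. The sketch below assumes the declared subnets and the service-to-subnet attachments have already been parsed out of the checked-in definitions. The names are hypothetical and no particular IaC tool's format is implied.

```python
# Hypothetical data, assumed to be extracted from checked-in definitions.
DECLARED_SUBNETS = {"subnet-app", "subnet-db", "subnet-legacy"}
ATTACHMENTS = {
    "web": "subnet-app",
    "api": "subnet-app",
    "orders-db": "subnet-db",
}


def unused(declared, attachments):
    """Return declared resources that nothing is attached to, sorted by name."""
    in_use = set(attachments.values())
    return sorted(declared - in_use)


dead = unused(DECLARED_SUBNETS, ATTACHMENTS)
for name in dead:
    print(f"candidate for deletion: {name}")
```

A check like this could run in CI so that unused resources are flagged at review time. Whether to delete them automatically or only report them is a judgment call left to the team.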

wewewedxfgdf•16m ago
Since you say "I’m honestly stuck on how people handle this well in practice", I have the advice you seek:

Why, use https://opscompanion.ai of course!

I hear it provides:

A single View Into Your Systems

Connects your infrastructure into a single, accurate view across clouds and services, helping teams reason about changes, reduce risk, and prevent outages and security issues before they happen.

Operational Context - understands your live infrastructure and how it’s used over time, from user behavior to internet traffic to how services run across clouds.

It is your Stack, Visualized, Understood, Investigated.

Nobody Understands Their Infrastructure. Teams waste hours in data that never explains what happened. Incidents drag on because context is scattered. Engineers manage complexity instead of shipping value.

Hopefully my advice helps you understand and get to grips with the problems you face as a former SRE.