
Log level 'error' should mean that something needs to be fixed

https://utcc.utoronto.ca/~cks/space/blog/programming/ErrorsShouldRequireFixing
124•todsacerdoti•3d ago

Comments

shadowgovt•2h ago
This is the standard I use as well. My general rule of thumb: if something logs at error, it would have been perfectly reasonable for the program to respond by crashing, and the only reason it didn't is that it's executing in some kind of larger context that wants to stay up when an individual component fails (like one handler, in a program with multiple threads serving web requests, getting hung by a query and having to be terminated by its monitor). In contrast, something like an ill-formed web query from an untrusted source isn't even an error, because you can't force untrusted sources to send you correctly formed input.

Warning, in contrast, is what I use for a condition that the developer predicted and handled but probably indicates the larger context is bad, like "this query arrived from a trusted source but had a configuration so invalid we had to drop it on the floor, or we assumed a default that allowed us to resolve the query but that was a massive assumption and you really should change the source data to be explicit." Warning is also where I put things like "a trusted source is calling a deprecated API, and the deprecation notification has been up long enough that they really should know better by now."

Where all of this matters is process. Errors trigger pages. Warnings get bundled up into a daily report that on-call is responsible for following up on, sometimes by filing tickets to correct trusted sources and sometimes by reaching out to owners of trusted sources and saying "Hey, let's synchronize on your team's plan to stop using that API we declared is going away 9 months ago."

raldi•1h ago
Right. If staging or the canary is logging errors, you block/abort the deploy. If it’s logging warnings, that’s normal.
nlawalker•56m ago
It seems that the easier rule of thumb, then, is that "application logic should never log an error on its own behalf unless it terminates immediately after", and that error-level log entries should only ever be generated from a higher-level context by something else that's monitoring for problems that the application code itself didn't anticipate.
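A minimal sketch of that split, with application code raising rather than logging on its own behalf, and the supervising loop owning the error level (all names here are hypothetical):

```python
import logging

logger = logging.getLogger("supervisor")

def run_task(task):
    """Application code raises on failure instead of logging ERROR for
    itself; only this higher-level monitor, which knows the failure was
    unanticipated, records it at error level before terminating the
    unit of work."""
    try:
        return task()
    except Exception:
        logger.exception("unhandled failure in task; terminating it")
        raise
```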
raldi•2h ago
Yes. Examples of non-defects that should not be in the ERROR loglevel:

* Database timeout (the database is owned by a separate oncall rotation that has alerts when this happens)

* ISE in downstream service (return HTTP 5xx and increment a metric but don’t emit an error log)

* Network error

* Downstream service overloaded

* Invalid request

Basically, when you make a request to another service and get back a status code, your handler should look like:

    logfunc = logger.error if 400 <= status <= 499 and status != 429 else logger.warning
(Unless you have an SLO with the service about how often you’re allowed to hit it and they only send 429 when you’re over, which is how it’s supposed to work but sadly rare.)
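Expanded into a runnable sketch (the logger name and helper are illustrative, not from the comment):

```python
import logging

logger = logging.getLogger("downstream-client")

def log_level_for(status: int):
    """Pick a log function for a status code received from a service
    we called: a 4xx (other than 429) means *our* request was bad,
    which is a defect on our side and thus an error; 5xx, 429 and the
    like are the other service's problem and only rate a warning."""
    if 400 <= status <= 499 and status != 429:
        return logger.error
    return logger.warning
```

The 429 carve-out matches the parenthetical above: a well-behaved service sends 429 only when you exceed an agreed quota, so it usually reflects their throttling policy rather than a malformed request.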
jonathrg•1h ago
4xx is for invalid requests. You wouldn't log a 404 as an error
raldi•1h ago
I’m talking about codes you receive from services you call out to.
jonathrg•1h ago
Oh that makes sense.
raldi•1h ago
There are still some special cases, because 404 is used for both “There’s no endpoint with that name” and “There’s no record with the ID you tried to look up.”
mewpmewp2•19m ago
What if a user sends some sort of auth token or other data that you yourself can't validate, and the third party gives you a 4xx for it? You won't know ahead of time whether that token or data is valid, only after making a request to the third party.
Hizonner•1h ago
> Database timeout (the database is owned by a separate oncall rotation that has alerts when this happens)

So people writing software are supposed to guess how your organization assigns responsibilities internally? And you're sure that the database timeout always happens because there's something wrong with the database, and never because something is wrong on your end?

raldi•1h ago
No; I’m not understanding your point about guessing. Could you restate?

As for queries that time out, that should definitely be a metric, but not pollute the error loglevel, especially if it’s something that happens at some noisy rate all the time.

makeitdouble•1h ago
> the database is owned by a separate oncall rotation

Not OP, but this part hits the same for me.

In the case your client app is killing the DB through too many calls (e.g. your cache is not working) you should be able to detect it and react, without waiting for the DB team to come to you after they investigated the whole thing.

But you can't know in advance whether the DB connection errors are your fault or not, so logging them to cover the worst-case scenario (you're the cause) is sensible.

raldi•1h ago
I agree that you should detect this, just through a metric rather than putting DB timeouts in the ERROR loglevel.
electroly•47m ago
I think OP is making two separate but related points, a general point and a specific point. Both involve guessing something that the error-handling code, on the spot, might not know.

1. When I personally see database timeouts at work it's rarely the database's fault, 99 times out of 100 it's the caller's fault for their crappy query; they should have looked at the query plan before deploying it. How is the error-handling code supposed to know? I log timeouts (that still fail after retry) as errors so someone looks at it and we get a stack trace leading me to the bad query. The database itself tracks timeout metrics but the log is much more immediately useful: it takes me straight to the scene of the crime. I think this is OP's primary point: in some cases, investigation is required to determine whether it's your service's fault or not, and the error-handling code doesn't have the information to know that.

2. As with exceptions vs. return values in code, the low-level code often doesn't know how the higher-level caller will classify a particular error. A low-level error may or may not be a high-level error; the low-level code can't know that, but the low-level code is the one doing the logging. The low-level logging might even be a third party library. This is particularly tricky when code reuse enters the picture: the same error might be "page the on-call immediately" level for one consumer, but "ignore, this is expected" for another consumer.

I think the more general point (that you should avoid logging errors for things that aren't your service's fault) stands. It's just tricky in some cases.

zbentley•1h ago
I wish I lived in a world where that worked. Instead, I live in a world where most downstream service issues (including database failures, network routing misconfigurations, giant cloud provider downtime, and more ordinary internal service downtime) are observed in the error logs of consuming services long before they’re detected by the owners of the downstream service … if they ever are.

My rough guess is that 75% of incidents on internal services were only reported by service consumers (humans posting in channels) across everywhere I’ve worked. Of the remaining 25% that were detected by monitoring, the vast majority were detected long after consumers started seeing errors.

All the RCAs and “add more monitoring” sprints in the world can’t add accountability equivalent to “customers start calling you/having tantrums on Twitter within 30sec of a GSO”, in other words.

The corollary is "internal databases/backend services can be more technically important to the proper functioning of your business, but frontends/edge APIs/consumers of those backend services are more visibly important to other people. As a result, edge services' users often provide more valuable telemetry than backend monitoring."

raldi•1h ago
But everything you’re describing can be done with metrics and alerts; there’s no need to spam the ERROR loglevel.
zbentley•42m ago
My point is that just because those problems can be solved with better telemetry doesn't mean that's actually done in practice. Most organizations are much more aware of/sensitive to failures upstream/at the edge than they are in backend services. Once you account for alert fatigue, crappy accountability distribution, and organizational pressures, even the places that do this well often backslide over time.

In brief: drivers don’t obey the speed limit and backend service operators don’t prioritize monitoring. Both groups are supposed to do those things, but they don’t and we should assume they won’t change. As a result, it’s a good idea to wear seatbelts and treat downstream failures as urgent errors in the logs of consuming services.

jillesvangurp•1h ago
Errors mean I get alerted. Zero tolerance on that from my side.
mfuzzey•1h ago
I think it's difficult to say without knowing how the system is deployed and administered. "If a SMTP mailer trying to send email to somewhere logs 'cannot contact port 25 on <remote host>', that is not an error in the local system"

Maybe or maybe not. If the connection problem is really due to the remote host then that's not the problem of the sender. But maybe the local network interface is down, maybe there's a local firewall rule blocking it,...

If you know the deployment scenario then you can make reasonable decisions on logging levels but quite often code is generic and can be deployed in multiple configurations so that's hard to do

greatgib•1h ago
The point is that if your program itself takes note of the error from the library, that's OK. You, as the program owner, can decide what to do with it (error log or not).

But if you are the SMTP library and you unilaterally log that as an error, that is an issue.

zamadatix•41m ago
The counterpoint made above is that while what you describe is indeed how the author likes to see it, that doesn't explain why "an error is something which failed that the program was unable to fix automatically" is any less valid a way to see it. I.e., should error be defined as "the program was unable to complete the task you told it to do", or only as "things which could have worked but you need to explicitly change something locally"?

I don't even know how to say whether these definitions are right or wrong; it's just whatever you feel it should be. The important thing is that what your program logs is documented somewhere; the next most important thing is that your log levels are self-consistent and follow some sort of logic. That I would have done it exactly the same is not really important.

At the end of the day, this is just bikeshedding about how to collapse ultra specific alerting levels into a few generic ones. E.g. RFC 5424 defines 8 separate log levels for syslog and, while that's not a ceiling by any means, it's easy to see how there's already not really going to be a universally agreed way to collapse even just these down to 4 categories.

colechristensen•40m ago
How about this:

- An error is an event that someone should act on. Not necessarily you. But if it's not an event that ever needs the attention of a person then the severity is less than an error.

Examples: invalid credentials, HTTP 404 Not Found, HTTP 403 Forbidden (all of the HTTP 400s, by definition).

It's not my problem as a site owner if one of my users entered the wrong URL or typed their password wrong, but it's somebody's problem.

A warning is something that A) a person would likely want to know and B) wouldn't necessarily need to act on

INFO is for something a person would likely want to know and unlikely needs action

DEBUG is for something likely to be helpful

TRACE is for just about anything that happens

EMERG/CRIT are for significant errors of immediate impact

PANIC the sky is falling, I hope you have good running shoes

DanHulton•24m ago
If you're logging and reporting on ERRORs for 400s, then your error triage log is going to be full of things like a user entering a password with insufficient complexity or trying to sign up with an email address that already exists in your system.

Some of these things can be ameliorated with well-behaved UI code, but a lot cannot, and if your primary product is the API, then you're just going to have scads of ERRORs to triage where there's literally nothing you can do.

I'd argue that anything that starts with a 4 is an INFO, and if you really wanted to be thorough, you could set up an alert on the frequency of these errors to help you identify whether there's a broad problem.

colechristensen•15m ago
You have HTTP logs tracked; you don't need to report them twice, once in the HTTP log and once on the backend. You're effectively raising the error to the HTTP server, and its logs are where the errors live. You don't alert on single HTTP 4xx errors (nobody does); you only raise on anomalous numbers of HTTP 4xx errors. You do alert on HTTP 5xx errors, because as "internal" HTTP errors those are always on you.

In other words, of course you don't alert on errors which are likely somebody else's problem. You put them in the log stream where that makes sense and can be treated accordingly.
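The "anomalous numbers" idea can be sketched as a rate check over a window of recent responses (the threshold is invented for illustration):

```python
from collections import Counter

def should_alert(statuses, max_4xx_ratio=0.2):
    """Individual 4xx responses are expected and not alerted on;
    any 5xx is 'internal' and always on us; an anomalous proportion
    of 4xx suggests something broader broke and is worth raising."""
    counts = Counter(s // 100 for s in statuses)
    total = sum(counts.values())
    if total == 0:
        return False
    if counts[5] > 0:          # any server error is our problem
        return True
    return counts[4] / total > max_4xx_ratio
```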

theli0nheart•1h ago
I agree with this.

Not everything that a library considers an error is an application error. If you log an error, something is absolutely wrong and requires attention. If you consider such a log as "possibly wrong", it should be a warning instead.

alexwasserman•1h ago
I have been particularly irritated in the past where people use a lower log level but include the higher log level's string in the message, especially where it's then parsed, filtered, and alerted on by my monitoring.

eg. log level WARN, message "This error is...", but it then trips an error in monitoring and pages out.

Probably breaching multiple rules here around not parsing logs like that, etc. But it's cropped up so many times I get quite annoyed by it.

jonathrg•1h ago
Stuff like that is a good argument for using structured logging, but even if you are just parsing text logs, surely you can make the parser be a bit more specific when retrieving the log level.
dragonwriter•47m ago
> I have been particularly irritated in the past where people use a lower log level and include the higher log level string in the message, especially where it's then parsed, filtered, and alerted on my monitoring.

If your parsing, filtering, and monitoring setup parses strings that happen to correspond to log level names in positions other than that of log levels as having the semantics of log levels, then that's a parsing/filtering error, not a logging error.
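Concretely, the fix is to extract the level only from the record's level field (or a fixed position in a text line), never by substring-matching the message body. A sketch, assuming a JSON structured format and a hypothetical `<timestamp> [LEVEL] message` text format:

```python
import json
import re

def level_of(line: str) -> str:
    """Read the level only from where levels live, so a WARN record
    whose message says 'This error is...' never matches as ERROR."""
    try:
        return json.loads(line)["level"]           # structured logs
    except (ValueError, KeyError, TypeError):
        m = re.match(r"\S+ \[(\w+)\]", line)       # "<ts> [LEVEL] msg"
        return m.group(1) if m else "UNKNOWN"
```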

vpribish•1h ago
I just started playing in the Erlang ecosystem, and they have EIGHT levels of logging messages. It seems crazily over-specific, but they are the champions of robust systems.

I could live with 4

Error - alert me now.

Warning - examine these later,

Info - important context for investigations.

Debug - usually off in prod.

regularfry•1h ago
The eight levels in Erlang are inherited from syslog, rather than something specific to Erlang itself.
dnautics•1h ago
Let's say you get a bunch of database timeouts in a row. This might mean that nothing needs to be fixed. But the "thing that needs to be fixed" might also be "the ethernet cable fell out of the back of your server".

How do you know?

raldi•1h ago
You have an alert on what users actually care about, like the overall success rate. When it goes off, you check the WARNING log and metric dashboard and see that requests are timing out.
ImPostingOnHN•58m ago
That is a lagging indicator. By the time you're alerted, you've already failed by letting users experience an issue.
makeitdouble•1h ago
> This assumes an error/warning/info/debug set of logging levels instead of something more fine grained, but that's how many things are these days.

Does it ?

Don't most stacks have an additional level of triaging logs to detect anomalies etc.? It can be New Relic/Datadog/Sentry or a self-made filtering system, but nowadays I'd assume the base log levels are only a rough estimate of whether a single event has any chance of being problematic.

I'd bet the author also has strong opinions about http error codes, and while I empathize, those ships have long sailed.

rwmj•1h ago
And the second rule is make all your error messages actionable. By that I mean it should tell me what action to take to fix the error (even if that action means hard work, tell me what I have to do).
andoando•1h ago
Error: Possible race condition, rewrite codebase
morkalork•42m ago
I have written out-of-band sanity checks that have caught race conditions, the recommendation is more like "<Thing> that should be locked, isn't. Check what was merged and deployed in the last 24h, someone ducked it up"
hyperadvanced•55m ago
This is just plain wrong; I vehemently disagree. Say a payment fails on my API, and today that means I need to go through a 20-step process with the payment provider, my database, etc. to correct it. What's worse is if this error happens 11,000 times and I run a script to do my 20-step process 11,000 times, but it turns out the error was raised in error. And because the error was so explicit about how to fix it, I didn't talk to anyone; of course, the suggested fix was out of date, because docs lag production software. Now I have 11,000 pissed-off customers because I was trying to be helpful.
throw3e98•55m ago
Maybe that makes sense for a single-machine application where you also control the hardware. But for a networked/distributed system, or software that runs on the user's hardware, the action might involve a decision tree, and a log line is a poor way to convey that. We use instrumentation, alerting and runbooks for that instead, with the runbooks linking into a hyperlinked set of articles.

My 3D printer will try to walk you through basic fixes with pictures on the device's LCD panel, but for some errors it will display a QR code to their wiki which goes into a technical troubleshooting guide with complex instructions and tutorial videos.

chongli•46m ago
Suppose I'm writing an http server and the error is caused by a flaky power supply causing the disk to lose power when the server attempts to read a file that's been requested. How is the http server supposed to diagnose this or any other hardware fault? Furthermore, why should it even be the http server's responsibility to know about hardware issues at all?
uniq7•28m ago
The error doesn't need to be extremely specific or point to the actual root cause.

In your example,

- "Error while serving file" would be a bad error message,

- "Failed to read file 'foo/bar.html'" would be acceptable, and

- "Failed to read file 'foo/bar.html' due to EIO: Underlying device error (disk failure, I/O bus error). Please check the disk integrity." would be perfect (assuming the http server has access to the underlying error produced by the read operation).
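As a sketch of that third style, a wrapper that attaches the path and the OS-level cause when a read fails (a hypothetical helper; the message format follows the examples above):

```python
import errno
import os

def read_page(path: str) -> bytes:
    """Attach the path and the errno-level cause to the failure,
    in the spirit of the 'perfect' message above."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError as e:
        code = errno.errorcode.get(e.errno, "?") if e.errno else "?"
        detail = os.strerror(e.errno) if e.errno else str(e)
        raise RuntimeError(
            f"Failed to read file '{path}' due to {code}: {detail}"
        ) from e
```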

1123581321•46m ago
Can you please explain this? That sounds like identifying bugs but not fixing them but I realize you don’t mean that. One hopes the context information in the error will make it actionable when it occurs, never completely successfully, of course.
magicalhippo•38m ago
This can be difficult or just not possible.

What is possible is to include as much information as you can about what the system was trying to do. If there's a file IO error, include the full path name. Saying "file not found" without saying which file was not found infuriates me like few other things.

If some required configuration option is not defined, include the name of the configuration option and from where it tried to find said configuration (config files, environment, registry etc). And include the detailed error message from the underlying system if any.

Regular users won't have a clue how to deal with most errors anyway, but by including details at least someone with some system knowledge has a chance of figuring out how to fix or work around the issue.

pixl97•30m ago
So what error do you put if the server is over 500 miles away?

https://web.mit.edu/jemorris/humor/500-miles

Or you can't connect because of a path MTU error.

Or because the TTL is set too low?

Your software at the server level has no idea what's going wrong at the network level, all you can send is some kind of network problem message.

jayofdoom•1h ago
In OpenStack, we explicitly document what our log levels mean; I think this is valuable from both an Operator and Developer perspective. If you're a new developer, without a sense of what log levels are for, it's very prescriptive and helpful. For an operator, it sets expectations.

https://docs.openstack.org/oslo.log/latest/user/guidelines.h...

FWIW, "ERROR: An error has occurred and an administrator should research the event." (vs WARNING: Indicates that there might be a systemic issue; potential predictive failure notice.)

quectophoton•1h ago
Thank you, this (and jillesvangurp's comment) sounds way more reasonable than the article's suggestion.

Say I have a daily cron job that copies files to a remote location (e.g. backups), and the _operation_ fails because for some reason the destination is not writable.

Your suggestion would get me alerts in _both_ cases, as I want; the article's suggestion would not alert me about the operation failing because, after all, it's not something happening in the local system, the local program is well configured, and it's "working as expected" since it needs neither code nor configuration fixing.

HarHarVeryFunny•1h ago
I agree with the sentiment, although not sure if "error" is the right category/verbiage for actionable logs.

In an ideal world things like logs and alarms (alerting product support staff) should certainly cleanly separate things that are just informative, useful for the developer, and things that require some human intervention.

If you don't do this then it's like "the boy who cried wolf": people will learn to ignore errors and alarms, since you've trained them to understand that usually no action is needed. It's also useful to be able to grep through log files and distinguish failures of different categories, not just grep for specific failures.

eterm•35m ago
How I'd personally like to treat them:

  - Critical / Fatal:  Unrecoverable without human intervention, someone needs to get out of bed, now.
  - Error : Recoverable without human intervention, but not without data / state loss. Must be fixed asap. An assumption didn't hold.
  - Warning: Recoverable without intervention. Must have an issue created and prioritised. ( If business as usual, this could be downgrading to INFO. )
The main difference therefore between error and warning is, "We didn't think this could happen" vs "We thought this might happen".

So for example, a failure to parse JSON might be an error if you're responsible for generating that serialisation, but might be a warning if you're not.

mewpmewp2•26m ago
What if you are integrated with a third-party app and it gives you a 5xx once? What do you log it as, given that after a retry it is fine?
bqmjjx0kac•24m ago
I would log a warning when an attempt fails, and an error when the final attempt fails.
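That policy is easy to encode in the retry loop itself (a sketch; names and defaults are illustrative):

```python
import logging
import time

logger = logging.getLogger("upstream")

def call_with_retries(op, attempts=3, delay=0.0):
    """Log each failed attempt as a warning; escalate to an error
    only when the final attempt has also failed."""
    for i in range(1, attempts + 1):
        try:
            return op()
        except Exception as e:
            if i == attempts:
                logger.error("attempt %d/%d failed, giving up: %s",
                             i, attempts, e)
                raise
            logger.warning("attempt %d/%d failed, retrying: %s",
                           i, attempts, e)
            time.sleep(delay)
```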
mewpmewp2•8m ago
You are not the OP, but I think I was trying to point out this example case in relation to their descriptions of Error/Warnings.

This scenario may or may not result in data/state loss, and it may be something that you yourself can't immediately fix. And if it's temporary, what is the point of creating an issue and prioritizing it?

I guess my point is that to any such categorization of errors or warnings there are way too many counter examples to be able to describe them like that.

So I'd usually think that Errors are something that I would heuristically want to quickly react to and investigate (e.g. being paged, while Warnings are something I would periodically check in (e.g. weekly).

eterm•11m ago
This might be controversial, but I'd say if it's fine after a retry, then it doesn't need a warning.

Because what I'd want to know is how often it fails, which is a metric, not a log.

So expose <third party api failure rate> as a metric not a log.

If feeding logs into Datadog or similar is the only way you're collecting metrics, then you aren't treating your observability with the respect it deserves. Put in real counters so you're not just reacting to what catches your eye in the logs.

If the third party being down has a knock-on effect to your own system functionality / uptime, then it needs to be a warning or error, but you should also put in the backlog a ticket to de-couple your uptime from that third-party, be it retries, queues, or other mitigations ( alternate providers? ).

By implementing a retry you planned for that third party to be down, so it's just business as usual if it succeeds on retry.
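As a stand-in for those "real counters" (a toy, not any particular metrics client), the failure rate can be tracked directly instead of reconstructed from log lines:

```python
from collections import Counter

class Metrics:
    """Toy counter store; a real system would use Prometheus, statsd,
    or similar. A retry that eventually succeeds bumps a failure
    counter here instead of emitting a warning log line."""
    def __init__(self):
        self.counters = Counter()

    def incr(self, name: str) -> None:
        self.counters[name] += 1

    def rate(self, failures: str, total: str) -> float:
        t = self.counters[total]
        return self.counters[failures] / t if t else 0.0
```

An alert can then fire when the third-party failure rate crosses a threshold, independent of anything in the logs.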

sysguest•18m ago
hmm maybe we need extra representation?

eg: 2.0 for "trace" / 1.0 for "debug" / 0.0 for "info" / -1.0 for "warn" / -2.0 for "error that can be handled"

teo_zero•24m ago
This doesn't resonate with my experience. I place the line between a warning and an error at whether the operation can or can't be completed.

A connection timed out, retrying in 30 secs? That's a warning. Gave up connecting after 5 failed attempts? Now that's an error.

I don't care so much if the origin of the error is within the program, or the system, or the network. If I can't get what I'm asking for, it can't be a mere warning.

AndroTux•21m ago
“cannot contact port 25 on <remote host>” may very well be a configuration error. How should the program know?
HankB99•14m ago
Would it make sense to consider anything that prevents a process from completing its intended function an error? It seems like this message would fall into that category and, as you pointed out, could result from a local fault as well.
kijin•12m ago
SMTP clients are designed to try again with exponential backoff. If the final attempt fails and your email gets bounced, now that's an error. Until then, it's just a delay, business as usual.
notatoad•3m ago
>How should the program know?

if we're talking about logs from our own applications that we have written, the program should know because we can write it in a way that it knows.

config should be verified before it is used. make a ping to port 25 to see if it works before you start using that config for actual operation. if it fails the verification step, that's not an error that needs to be logged.
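A sketch of that verification step, assuming a plain TCP connect is an acceptable probe (names and timeout are illustrative):

```python
import socket

def verify_smtp_reachable(host: str, port: int = 25,
                          timeout: float = 3.0) -> bool:
    """Startup-time config check: probe the configured host:port once
    before entering normal operation, so an unreachable relay surfaces
    as a configuration problem rather than as runtime error logs."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```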

mkoubaa•6m ago
To me it's always a neat trick when you're not allowed to use print() in production code