"Mal de muchos, consuelo de tontos." (Roughly: "Misery shared by many is a fool's consolation.")
"The issue is related to heating/cooling complications in the data center due to a power outage. The power outage has been fixed and we are working quickly to bring our services back online."
My infrastructure is redundant and spread out among hosting providers and DCs so there's no real impact, but I'm pretty sure this is the longest outage I've ever had with any provider. And the communication has been so disappointing. 4 hrs to say it's a power / HVAC issue? Since then, the updates have basically just said "we're still working on it."
- Many came back up yesterday. Most of the rest came back up this morning.
- All but two are back online. One of those is "Powered off" but can't be turned on because "Linode busy". The other is online but unreachable, same behavior as most of them during the outage.
- Three required me to put them in Rescue Mode and run fsck.ext4 -F /dev/sda to get them back online.
Also, I've taken it upon myself to PROPERLY implement a better, redundant backup strategy (I was mainly relying on Linode's backup service, but now I feel I should go beyond that). I am using restic, backing up to a Backblaze bucket via the S3 interface. The nice thing is I can put all hosts into the same bucket and restic will organize snapshots by host while still deduplicating across all of them. Not sure how much that'll net me but it's nice to have.
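For anyone wanting to try the same setup, here is a rough sketch of how the restic invocation could be assembled per host. The endpoint, bucket name, paths, and password file location are all made-up placeholders, not my actual config, and the helper names are just for illustration. It only builds and prints the command; actually running it is up to you.

```python
import shlex
import socket


def restic_repo_url(endpoint, bucket):
    """Build a restic repository string for an S3-compatible endpoint."""
    return f"s3:https://{endpoint}/{bucket}"


def build_env(key_id, secret, password_file):
    """Environment variables restic reads for S3 credentials and the repo password."""
    return {
        "AWS_ACCESS_KEY_ID": key_id,
        "AWS_SECRET_ACCESS_KEY": secret,
        "RESTIC_PASSWORD_FILE": password_file,
    }


def build_backup_command(repo, paths, host=None):
    """Assemble `restic backup` so snapshots are labelled with this host's name.

    All hosts point at the same repository, which is what lets restic
    deduplicate across them.
    """
    host = host or socket.gethostname()
    return ["restic", "-r", repo, "backup", "--host", host, *paths]


# Hypothetical values: substitute your own endpoint, bucket, and paths.
repo = restic_repo_url("s3.us-west-004.backblazeb2.com", "example-backups")
cmd = build_backup_command(repo, ["/etc", "/home"], host="web1")
print(shlex.join(cmd))
```

To actually run it, pass `cmd` to `subprocess.run` with the environment from `build_env` merged into `os.environ`. The repository has to be created once with `restic init` first, and `restic snapshots --host web1` then lists only that host's snapshots. One caveat with a shared repository: every host needs the same repository password.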
dreeves•6mo ago
upseo•6mo ago
mathrawka•6mo ago
dreeves•6mo ago
basilgohar•6mo ago
basilgohar•6mo ago
These are secondary effects of the outage: not something Linode did directly, but still caused by the outage itself.
cmg•6mo ago
Happy Sunday! Cleaning up the automatically-created maintenance/alert tickets generated by this is going to be a fun time.