However: Don’t underestimate community support (in the areas you’re likely to want it) when comparing development stacks.
What’s the point in having 64 GB of DDR5 and 16 cores @ 4.2 GHz if not to have a couple of Electron apps sitting at idle, yet somehow still using the equivalent computational resources of the most powerful supercomputer on Earth in the mid-1990s?
Oh, and put everything behind the strictest Cloudflare settings you can, so that even a whiff of anything that’s not a Windows 11 laptop or an iPhone on a major U.S. residential or mobile network IP gets non-stop bot checks!
Yeah, I'm not worried about being targeted in an RCA and pointedly asked why I chose a region with way better uptime than `us-tirefire-1`.
What _is_ worth considering is whether your more carefully chosen region will actually perform better during an outage in which some critical AWS resource goes down in Virginia and takes your region with it anyway.
AWS Organizations/account management lives in us-east-1.
And if you want a CDN with a custom hostname and want TLS…you have to use us-east-1.
CloudFront CDN has a similar setup. The SSL certificate and key have to be hosted in us-east-1 for control-plane operations, but once deployed, the public data plane is globally or regionally dispersed. There is no automatic failover for the certificate dependency yet, the SLA is only three nines, and it also depends on Route 53.
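For anyone who hasn’t hit this in practice, the us-east-1 pinning shows up the moment you script it. A minimal boto3 sketch (the domain and distribution settings are placeholders, not anything from this thread):

```python
# The ACM certificate CloudFront serves a custom hostname with must be
# requested in us-east-1, no matter where the rest of the stack lives.
import boto3

acm = boto3.client("acm", region_name="us-east-1")  # pinned region

cert = acm.request_certificate(
    DomainName="www.example.com",   # placeholder custom hostname
    ValidationMethod="DNS",         # validated via a DNS (e.g. Route 53) record
)
cert_arn = cert["CertificateArn"]
print("Requested:", cert_arn)

# Once the certificate is ISSUED, it gets referenced from the distribution's
# ViewerCertificate config. CloudFront's data plane is global, but this
# control-plane dependency stays in us-east-1:
#
#   cloudfront = boto3.client("cloudfront")
#   cloudfront.create_distribution(DistributionConfig={
#       ...,
#       "ViewerCertificate": {
#           "ACMCertificateArn": cert_arn,
#           "SSLSupportMethod": "sni-only",
#       },
#   })
```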
The elephant in the room for hyperscalers is the potential for rogue employees or a cyber attack on a control plane. Considering the high stakes and economic criticality of these platforms, both are inevitable and both have likely already happened.
Separately from that, if you are trying to move certain types of non-mainstream IBM workloads to the cloud (AIX, IBM i, z/OS), then IBM is tier 1 in that case.
us-east-2 is objectively a better region to pick if you want US east, yet you feel safer picking use1 because “I’m safer making a worse decision that everyone understands is worse, as long as everyone else does it as well.”
Never getting blamed for a us-east-1 outage is better than picking us-east-2, if the latter could get you blamed the 0.5% of the time it goes down while us-east-1 stays up.
I can’t tell if it’s you thinking this way, or if your company is set up to incentivize this. But either way, I think it’s suboptimal.
That’s not about “risk profile” of the business or making the right decision for the customer, that’s about risk profile of saving your own tail in the organizational gamesmanship sense. Which is a shame, tbh. For both the customer and for people making tech decisions.
I fully appreciate that some companies may encourage this behavior, and we all need a job so we have to work somewhere, but this type of thinking objectively leads to worse technology decisions and I hope I never have to work for a company that encourages this.
Edit: addressing blame when things go wrong… don’t you think it would be a better story to tell your boss that you did the right thing for the customer, rather than “I did this because everyone else does it, even though most of us agree it’s worse for the customer in general”? I would assume I’d get more blame for the second decision than the first.
See any companies getting credit for it in the last AWS outage? I didn't. My employers didn't reward vendors who stayed up during it.
Shame about your employer, though.
US-East-2 staying up isn’t my responsibility. If I need my own failover, I’m going to select a different region anyway.
And it’s not like US-East-2 isn’t already huge and growing. It’s effectively becoming another US-East-1.
No, but you can be blamed if other things are up and yours is not. If everyone's stuff is down, it is just a natural disaster.
If my cloud provider goes down and also takes down Spotify, Snapchat, Venmo, Reddit, and a ton of other major services that my customers and my boss use daily, they will be much more understanding that there is a third party issue that we can more or less wait out.
Every provider has outages. US-east-2 will sometimes go down. If I'm not going to make a system that can fail over from one provider to another (which is a lot of work and can be expensive, and really won't be actively used often), it might be better to just use the popular one and go with the group.
The regions provide the same functionality, so I see genuinely no downside or additional work in picking one region over the other.
It seems like one of those no brainer decisions to me. I take pride in being up when everyone else is down. 5 9s or bust, baby!
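(For scale, here’s a quick back-of-the-envelope on what each extra nine actually allows per year; this is just the standard downtime arithmetic, nothing specific to any region or SLA.)

```python
# Downtime budget per year for each "number of nines" of availability.
MINUTES_PER_YEAR = 365.25 * 24 * 60

for nines in range(2, 6):
    availability = 1 - 10 ** -nines              # e.g. 3 nines -> 0.999
    downtime = MINUTES_PER_YEAR * (1 - availability)
    print(f"{nines} nines ({availability:.3%}): ~{downtime:,.1f} min/year of downtime")
```

Five nines works out to roughly five minutes of downtime a year, which is a very different engineering problem than three nines’ roughly nine hours.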
There's just not much motivation left to do better systems.
“Duh, because there’s an AZ in us-east-1 where you can’t configure EBS volumes for attachment to fargate launch type ECS tasks, of course. Everybody knows that…”
:p
So if you tried to be "smart" and set up in Ohio, you got crushed by the thundering herd coming out of Virginia, and then got bitten again because AWS barely cares about your region and neither does anyone else.
The truth is Amazon doesn't have any real backup for Virginia. They don't have the capacity anywhere else and the whole geographic distribution scheme is a chimera.
Makes one wonder, does us-west-2 have the capacity to take on this surge?
Is this from real experience of something that actually happened, or just imagined?
The only things that matter in a decision are:
* Services that are available in the region
* (if relevant and critical) Latency to other services (see the probe sketch below)
* SLAs for the region
Everything else is irrelevant.
If you think AWS is so bad that their SLAs are not trustworthy, that's a different problem to solve.
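For the latency item above, a crude probe is often enough for a first cut. A sketch (the regions and the DynamoDB endpoint are just examples; this measures a full TLS + HTTP round trip from wherever you run it, not a proper benchmark):

```python
# Rough latency probe against a few regional AWS endpoints.
import time
import urllib.request

REGIONS = ["us-east-1", "us-east-2", "us-west-2"]   # candidate regions (examples)

for region in REGIONS:
    url = f"https://dynamodb.{region}.amazonaws.com/"   # public regional endpoint
    start = time.perf_counter()
    try:
        urllib.request.urlopen(url, timeout=5)
    except Exception:
        pass  # an unauthenticated request may be rejected; we only want the timing
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{region}: ~{elapsed_ms:.0f} ms")
```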
Big fail.
I have said for years, never ascribe to terrorism what can be attributed to some backhoe operator in Ashburn, Virginia.
We got a lotta backhoes in northern Virginia.
We've started to see some rather interesting consequences for grid reliability: https://blog.gridstatus.io/byte-blackouts-large-data-center-...
At this point my garage is tied with us-east-1 for reliability, largely because it got flooded 8 months ago.
If you are using Hetzner: avoid everything other than the FRA region, and ideally pray that you are placed in the newer part of the datacenter, since it has the upgraded switching spine. I haven't seen the old one in a bit, so they might have deprecated it entirely.
In the early days of cross-region inference, fewer people were using it, and there was basically no monitoring (and/or alerting) on Amazon's side.
The cross-region and global inference routing is... odd at times.
- Is X region and its services covered by a suitable SLA? https://aws.amazon.com/legal/service-level-agreements/
- Does X region have all the explicit services you need? (note that things like certs and IAM are "global", so often implicitly us-east-1; see the sketch after this list)
- What are your PoP latency requirements?
- Do you have concerns about sovereign data: hosting, ingress, and egress? https://pages.awscloud.com/rs/112-TZM-766/images/AWS_Public_...
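The service-coverage question can be checked mechanically using the public SSM parameters AWS publishes about its own infrastructure under `/aws/service/global-infrastructure/`. A rough sketch, assuming that parameter layout (the candidate region and service keys are just examples):

```python
# Diff the services advertised for a candidate region against what you need.
import boto3

REGION = "us-east-2"                       # candidate region (example)
NEEDED = {"lambda", "eks", "athena"}       # example service keys

# The public parameters can be read from any region.
ssm = boto3.client("ssm", region_name="us-east-1")
path = f"/aws/service/global-infrastructure/regions/{REGION}/services"

available = set()
for page in ssm.get_paginator("get_parameters_by_path").paginate(Path=path):
    for param in page["Parameters"]:
        # Parameter names end in the service key, e.g. ".../services/lambda"
        available.add(param["Name"].rsplit("/", 1)[-1])

missing = NEEDED - available
print(f"{REGION} is missing: {sorted(missing) or 'nothing you asked for'}")
```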
This analysis is skewed by the major incident in 2025. What was the data for 2024, and over the last, say, 5 years? The proclamation that us-east-1 is the least reliable is based on one year of data, and it’s probably fair to say that at least the last three years, if not five, are a better predictor of reliability.
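To make that concrete, here’s a toy calculation with entirely made-up outage minutes (not real AWS data) showing how one bad year can dominate a single-year ranking while mostly washing out over five:

```python
# Hypothetical outage minutes per region per year -- illustrative only.
MINUTES_PER_YEAR = 365.25 * 24 * 60

outages = {
    "region-A": {2021: 45, 2022: 30, 2023: 20, 2024: 25, 2025: 840},
    "region-B": {2021: 60, 2022: 55, 2023: 50, 2024: 40, 2025: 35},
}

for region, by_year in outages.items():
    latest = max(by_year)
    one_year = 1 - by_year[latest] / MINUTES_PER_YEAR
    five_year = 1 - sum(by_year.values()) / (len(by_year) * MINUTES_PER_YEAR)
    print(f"{region}: {latest} only {one_year:.4%}, five-year {five_year:.4%}")
```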
us-east-1 also hosts some special things, so it will have more services to lose.
There aren't that many businesses that truly can't handle the worst case (so far) AWS outage. Payment processing is the strongest example I can come up with that is incompatible with the SLA that a typical cloud provider can offer. Visa going down globally for even a few minutes might be worse than a small town losing its power grid for an entire week.
It's a hell of a lot easier to just go down with everyone else, apologize on Twitter, and enjoy a forced snow day. Don't let it frustrate you. Stay focused on the business and customer experience. It's not ideal to be down, but there are usually much bigger problems to solve. Chasing an extra x% of uptime per year is usually not worth a multicloud/region clusterfuck. These tend to be even less resilient on average.
It’s kind of amazing that after nearly 20 years of “cloud”, the worst case so far still hasn’t been all that bad. Outages are the mildest type of incident. A true cloud disaster would be something like a major S3 data loss event, or a compromise of the IAM control plane. That’s what it would take for people to take multi-region/multi-cloud seriously.
So like the OVH data center fire back in 2021?
(No shade on OVH, but they are ~1% market share player)
https://arstechnica.com/information-technology/2011/04/amazo...
You mean like stealing the master keys for Azure? Oh wait a minute...
You forget things like emergency services. If we were to rely on AWS (even with a backup/DR zone in another region) and were to go down with everyone else and twiddle our thumbs, houses burn down, people die, and our company has to pay abatements to the government.
It's often seen as the "standard" or "default" region to use when spinning up new US-based AWS services; it's the oldest AWS region, has the most interconnected systems, and likely carries the highest average load.
It makes sense that us-east-1 has reliability problems, but I wish Amazon were a little more upfront about the risks of choosing that region.