Ask HN: GCP Outage?

86•grilledchickenw•6mo ago

Status page is, as expected, all green, but is anyone noticing anything unusual? Services on Cloud Run are timing out for me.

Comments

staletofu•6mo ago

Can't log in with Google to LangSmith at the moment, and GCP login isn't loading either. Seems like there is something afoot.

jeanlucas•6mo ago

Some issues here in Brazil

hu3•6mo ago

One of my multi-region clients is also affected by Brazil GCP.

tosh•6mo ago

Firebase Firestore is either down or very high latency in us-east1

ghxst•6mo ago

Multiple people within our company reporting issues. Mostly from US, us in the EU still seem fine as of right now.

edit: Never mind, it's down for me now as well.

archiolidius•6mo ago

YouTube API partially down

romanzubenko•6mo ago

We first noticed Google login issues with our app. We can't log in with Google anywhere now, and Google Analytics is down as well.

jeanlucas•6mo ago

Multiple people in Brazil reporting:

- SSO issues;

- Google workspace tools not loading;

current time: 2025-07-18T15:35:43+00:00 12h35 GMT-3

ChrisArchitect•6mo ago

https://www.google.com/appsstatus/dashboard/incidents/oFcAZT...

dondraper36•6mo ago

https://status.cloud.google.com/incidents/8cY8jdUpEGGbsSMSQk...

Seems to be some hardware problem at least in us-east1

jbreckmckye•6mo ago

Someone unplugged the Big Router

bn-l•6mo ago

Every ui in gcp is ugly and painful and slow. Why?

palcu•6mo ago

There is an external incident now.

https://status.cloud.google.com/incidents/8cY8jdUpEGGbsSMSQk...

lebski88•6mo ago

Our VPN restarted about an hour ago and caused a bit of excitement, but on the whole it's been a lot less _interesting_ than the last one, thankfully.

dangoodmanUT•6mo ago

Reminder that multi cloud >>> multi region

Anyone who says otherwise is selling availability theater

Too many whole-cloud outages due to a bad config in the last 2 months (GCP x2, cloudflare x2)

jonathaneunice•6mo ago

And also that effort(multi cloud) >>> effort(multi region)

18172828286177•6mo ago

This isn’t a whole-cloud outage. It’s not even a whole-region outage.

Whole-cloud outages are pretty damn rare. The recent GCP issues are an exception to the general rule.

I’d posit that the complexity of a multi-cloud setup is generally going to reduce your service’s reliability more than relying on a single cloud does.

remram•6mo ago

Whole-zone outages are also rare...

FrankPetrilli•6mo ago

"Rarity" is a distinction without merit in this particular case; the important thing to note is that (most) clouds don't guarantee _any_ availability of a single zone. A system which stashes all of its infrastructure in one zone only is expected to be impacted by issues with that cloud, while a multi-zone setup spanning a region is generally "soft-guaranteed" to be resilient to normal operations / failures.

remram•6mo ago

> (most) clouds don't guarantee _any_ availability of a single zone

Really?

AWS (EC2) does: https://aws.amazon.com/compute/sla/?did=sla_card&trk=sla_car... so does GCP (GCE): https://cloud.google.com/compute/sla?hl=en and so does OVH: https://us.ovhcloud.com/legal/sla/public-cloud/

Are none of those three part of "most clouds"? What cloud platform do you use?

dangoodmanUT•6mo ago

Not about regions, it’s about services

JohnMakin•6mo ago

I've maintained a large multi-cloud architecture in the past. The problem is they really hit you hard on egress costs. Of course the motivation is obvious: they want to keep you locked in to their platform. I did like that it gave stronger leverage in contract renewals, but that was about it. The IaC was much more complicated and required more people/areas of knowledge. So it's definitely a tradeoff.

You are correct that it's "better" though if your goal is to have as many 9's of uptime as possible.

mads_quist•6mo ago

I currently have the strong opinion that for many mid-sized orgs with 250+ engineers it can be more resilient to go back to bare metal, or at least VM-only, in two or three local data centers. Yes, you need to know that they do their job well. But it will probably also reduce a lot of devops overhead...

dilyevsky•6mo ago

There are multiple companies that help you with that by running tunnels via Direct Interconnect (Direct Connect in AWS) so that you "only" pay 2c/G egressing data out of VPC via this tunnel

JohnMakin•6mo ago

Yes, I have quite a bit of experience with Direct Connect. The costs add up in weird ways. If you want to spend on it, though, multi-cloud is extremely resilient, and it's my preferred architecture if money and talent are no object.

dilyevsky•6mo ago

Totally. You need some serious volume to make it worth it because they charge you like $30/hr per 10g allocation iirc
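To put a rough number on "serious volume", here is a minimal break-even sketch. It takes the two figures recalled in this sub-thread ($30/hr per 10G allocation and 2c/GB over the tunnel) at face value; the standard internet egress rate of $0.08/GB and the 730 hours per month are assumptions for illustration only, not quoted prices from any provider, and the function name is hypothetical.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def break_even_gb_per_month(port_cost_per_hour: float,
                            interconnect_rate_per_gb: float,
                            internet_rate_per_gb: float) -> float:
    """Monthly egress volume (GB) at which the fixed port fee pays for itself.

    Below this volume, plain internet egress is cheaper; above it, the
    interconnect tunnel is cheaper.
    """
    savings_per_gb = internet_rate_per_gb - interconnect_rate_per_gb
    if savings_per_gb <= 0:
        raise ValueError("interconnect must be cheaper per GB to ever break even")
    fixed_monthly_fee = port_cost_per_hour * HOURS_PER_MONTH
    return fixed_monthly_fee / savings_per_gb


if __name__ == "__main__":
    # $30/hr and $0.02/GB come from the comments above ("iirc");
    # $0.08/GB is a placeholder for whatever your provider actually charges.
    gb = break_even_gb_per_month(30.0, 0.02, 0.08)
    print(f"break-even: about {gb / 1000:.0f} TB of egress per month")
```

Under those assumed rates the fixed fee only pays off at several hundred TB of egress a month, which matches the point that the costs add up in ways that are easy to miss.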

romanhn•6mo ago

I worked at PagerDuty, so definitely not selling availability theater. We did multi-cloud / multi-region for many years, and the story is not so simple. Development is all about trade-offs, and deciding what risk you are OK with. Multi-cloud provided a relatively small amount of value (given how incredibly unlikely whole-cloud outages are, even full-region outages are quite rare) at the expense of 2x implementation overhead, 2x exposure to random cloud-specific operational events, and the need to develop for the common denominator of functionality, which leaves out a LOT of interesting cloud offerings. In the end, it ended up just not being worth it, and moving to the single-cloud multi-region config provided enough reliability even for the company where reliability is the primary differentiator.

In my current job as a technical due diligence advisor, I frequently recommend multi-AZ setup but specifically not multi-region, because the former is easy and worthwhile while the latter carries a lot more operational overhead (you become much more sensitive to various latencies and network jitters) and you now need to think about things like synchronous vs async replication, etc. Much better to focus dev effort on the product, rather than eke out an additional .001% of availability (unless availability is a super critical component).
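The diminishing returns described above can be sketched with the textbook redundancy formula. This is an idealized model, not anyone's measured numbers: it assumes replicas fail independently and that any single healthy replica can serve all traffic. Correlated failures (a bad global config push, for example) break the independence assumption, which is exactly what the multi-cloud vs. multi-region argument in this thread is about. The function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` independent copies when any one suffices."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("need at least one replica")
    # The system is down only when every replica is down at the same time.
    return 1.0 - (1.0 - single) ** replicas


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # 0.999 per replica is an illustrative input, not a provider SLA.
    for n in (1, 2, 3):
        a = combined_availability(0.999, n)
        print(f"{n} replica(s): {a:.9f} "
              f"({downtime_minutes_per_year(a):.4f} min/year)")
```

Each extra replica removes a smaller absolute slice of downtime than the one before it, while the operational overhead of running it does not shrink the same way.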

Ironlink•6mo ago

Our system in EU observed some slowness and a few 500 and 503 responses from `identitytoolkit.googleapis.com` over a period of about 10 minutes.

blitzar•6mo ago

Systems down, heading to the pub.

ge96•6mo ago

Every time pager duty hits, take a shot

mbf•6mo ago

I forgot all about pager duty... been retired over a year now. I don't miss pager duty.

CoastalCoder•6mo ago

> I forgot all about pager duty

Probably because it's hard to form long-term memories when you're sleep-deprived :/

blitzar•6mo ago

or 8 shots in

dijit•6mo ago

Maybe centralising all our IT infrastructure wasn't a good idea after all.

kenmacd•6mo ago

I dunno. If just your employer's site is down then you'll be expected to fix it, whereas if everyone is down there's less pressure.

dpkirchner•6mo ago

yup. I figure I'm basically a free-rider (except I am paying a relatively small amount.)

dijit•6mo ago

Nobody who talks to actual stakeholders can use this as a defence.

B2B customers don’t care if the other sites are also down, your SLA is affected with them, and they will want compensation.

hadlock•6mo ago

You need to phrase it as Internet Weather.

gsliepen•6mo ago

It was an act of Google^H^H^H^Hd.

freedomben•6mo ago

Definitely been seeing a handful of 50x errors this morning. Fortunately seems like a partial outage but definitely annoying (and can sometimes indicate worse trouble coming)

abhisek•6mo ago

Yes. Many times. Kubernetes upgrade during maintenance schedule borks up entire cluster, yet everything is green on status page. Support case under enterprise support plan took almost 6 hours to get it resolved.

dilyevsky•6mo ago

I see they made great strides in the past 5 years - it used to take days =)
