Railway Is Having a Major Outage

https://status.railway.com/#/

71•kgraves•2h ago

Comments

iloveplants•2h ago

seems like it's every day

throwaranay4933•2h ago

This screenshot from Discord suggests the outage was caused by an automated GCP account ban: https://x.com/acgfbr/status/2056866780866351323

faangguyindia•1h ago

Google Cloud also locked out a Korean government organization recently. The guy posted on the GCP subreddit.

Google really need to improve their support team. It's strange such a big corp can't even afford to have proper support team.

King-Aaron•1h ago

> It's strange such a big corp can't even afford to have proper support team

This seems to be by design.

ndneighbor•24m ago

We have a CSM, Head of Customer Support contact, and further contacts with GCP. Despite that, we still had this issue.

danpalmer•43m ago

> It's strange such a big corp can't even afford to have proper support team

Railway say they are in touch with that support team.

shooker435•9m ago

god help them

add-sub-mul-div•15m ago

Automating support, automating everything is the key to their whole deal. Tech giants leapfrogged the rest of the economy by innovating a company that can scale its customers without having to scale itself proportionally.

benwoodward•4m ago

pretty sure their support team is a flaky ML model that is haplessly flagging random accounts

choilive•2m ago

Not strange, Google has never had a proper support team unless you are an "Enterprise" level customer.

mcontrerazCL•1h ago

all my fkn postgres dbs are on railway! what do i do now?

cactusplant7374•53m ago

Take a walk. Breathe in the fresh air. It feels good.

eoswald•28m ago

Hahah at least you're not getting called every five minutes because you can't shut off the alerts, because it's apparently deployed SOMEWHERE but good luck finding how to access it. Can't wait to see the bill from Twilio because of this lol

ryanisnan•1h ago

Yikes. I was wondering why my TLS certs were coming up as invalid.

eoswald•1h ago

Sorry, I have a hard time blaming Google for this, when Railway seems to be having increasing trouble keeping the platform stable. Something like this should NOT take down an ENTIRE service. There should be a backup when literally your business is about being the reliable backend. This just seems like poor planning to me.

cactusplant7374•55m ago

Disaster recovery is pretty expensive, right? Especially for their size.

ryanisnan•50m ago

I don't quite know what you mean. Do you really expect Railway to use a multi-cloud architecture to host all of their clients' projects? I suspect that would lead to lower availability, all things considered.

impulser_•41m ago

They literally own their own data centers. That's what's surprising about this. They are lying to their customers when they say they operate their own data center because obviously they don't if everyone's apps are down with GCP blocking their account.

ryanisnan•39m ago

Oh, I see what you mean. Eh, it's possibly the same reason that AWS essentially goes down when us-east-1 goes down.

brookst•17m ago

Is it not possible that they own their own data center and have an unfortunate Google dependency?

Obviously a fiasco but I’m not prepared to call them liars when it could be an honest mistake.

Terr_•9m ago

I imagine there's also an important difference between:

1. We depend on X but could gracefully migrate to an alternate in a week if we really needed to.

2. All data is mirrored instantly so that we can do seamless fail-over in case X has its own outage.
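To make option 2 concrete, here is a rough Python sketch of mirrored writes with read fail-over. Everything in it is hypothetical: `Store` is an in-memory stand-in for a database at one provider, and `MirroredStore` is an illustrative name, not anything Railway or GCP actually ships. It ignores the hard parts (partial write failures, replication lag, resyncing after an outage).

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical in-memory stand-in for a database at one provider."""
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        self.data[key] = value

    def read(self, key):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.data[key]


class MirroredStore:
    """Option 2 (assumed design): write to both copies, fail over on reads."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica

    def write(self, key, value):
        # Naive synchronous mirroring: both writes must succeed.
        # A real system would have to handle one side failing mid-write.
        self.primary.write(key, value)
        self.replica.write(key, value)

    def read(self, key):
        try:
            return self.primary.read(key)
        except ConnectionError:
            # Primary provider is gone; serve from the mirrored copy.
            return self.replica.read(key)
```

With option 1 you would only have the `Store` at X plus a migration plan; the replica would not exist until you built it during the outage.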

eoswald•39m ago

Well, by the same token, is it smart to base your ENTIRE architecture on a single cloud? Isn't that why some of us build in fallbacks for AWS-hosted services? I mean, their entire platform, both public and private facing, is running on the same thing. One error, one problem, takes out the entire service.

irjustin•16m ago

Taking this at face value, this doesn't happen to AWS clients - at least I don't read about it here.

AWS may have data centers[0] go[1] down[2], but that's within expected bounds of standard ops.

[0] https://hooks.slack.com/services/TJ7HQS7FC/B0B5S7UTBJ4/PUHIC...

[1] https://www.aljazeera.com/news/2025/10/21/what-caused-amazon...

[2] https://netflixtechblog.com/lessons-netflix-learned-from-the...

brokenodo•52m ago

I’m a new customer and have been falling in love with Railway over the last 2 weeks, but this is quite the wake up call.

csw-001•50m ago

Literally in the same boat. I've been really happy with it, but this is a major eye opener.... It's been down for a looooong time by provider standards.

reelvideocap•44m ago

same

TheAtomic•18m ago

same same

Mengkudulangsat•50m ago

That explains why all my vibe-coded hobby projects are down.

Thank God I'm not dealing with any public-facing sites! Would have been an expensive lesson for a newbie coder if my job depended on this.

rekabis•47m ago

TL;DR: putting all your eggs into one basket is bad, man.

lfx•39m ago

That’s true, however having only a few eggs and shopping for several baskets does not make sense in the early days. Not sure how big Railway is, but usually you start small with one egg.

christophilus•28m ago

You’d think they wouldn’t have started with GCP. There are plenty of datacenters where you can buy racks and racks of servers, and talk to a human when something goes wrong, and even walk in and access your servers. That’s what I’d be using if I were to build a Rackspace today.

tomschlick•21m ago

They started on GCP and have been migrating to their own "Metal" DC doing exactly what you're describing. But GCP is still their overflow given how rapidly they are growing and holds some amount of networking that routes to their DC.

bshack0•47m ago

so....what are we switching to y'all? cloud-run ? ;P

auxiliarymoose•37m ago

federated hardware (a bunch of raspberry pis networked into a high availability kubernetes cluster, hidden across various local coffee shops for free power and bandwidth)

throwatdem12311•34m ago

raspberry-pi cluster in my closet

bshack0•46m ago

so...what are we switching to yall? cloud-run :P

enahs-sf•42m ago

I respect what railway is doing but also would never run my business on such a platform.

dpark•36m ago

That kind of sounds like you don’t respect what they are doing.

eoswald•34m ago

Today changed my opinion on them completely. Was willing to give them the benefit of the doubt that they're growing fast, but now seeing that they've failed to scale properly, and are missing little things that become big things later. I can't take that risk.

fjni•39m ago

Wait… Railway runs on GCP? Didn’t they make a whole thing about not “building a cloud on top of another cloud?”

Or did they just mean that they’re not renting VPSs but only metal from the cloud provider?

In my mind I was so excited that there was another provider not just paying one of the hyperscalers but at a minimum colocating and owning more of their stack. https://blog.railway.com/p/heroku-walked-railway-run

eoswald•36m ago

Yep, and this is why I'm pissed. They lied. They're completely dependent on GCP. So, I gotta do some research, I need something a little more stable (and less dependent on one company's whims) than this. This is bad for them, because it really strikes at the heart of their 'big claim,' peaceful software deployments. This is chaos.

ndneighbor•25m ago

Yea, I mean, that's the whole MO of our platform and we failed at that. So yea, that's disappointing and more so for our customers.

I can provide an explanation about the GCP dependency. Yes, we host workloads off GCP, and we have been able to build a good business by performing a cloud exit. However, we were worried that we would have a circular dependency on our own cloud. I don't think we expected to get auto-modded out of our own account, hence we left our DB on CloudSQL.

It was never our intent to deceive people that we didn't own our own destiny with our business. The last GCP issue, we were assured that this scenario wouldn't happen (when we got auto-ratelimited, which was bad, but survivable) - but it seems like we have further work to do. Apologies.

fontain•19m ago

I’m very sympathetic and understand that decisions are easy to criticize in hindsight but leaving your database in GCP while moving everything else to your own data centres seems so backwards I can’t even begin to imagine how that could happen. Was this really an intentional design decision?

ndneighbor•10m ago

> decisions are easy to criticize in hindsight

I mean, the pain we have caused our customers ultimately proves you correct. That said, we made our decisions with the information and constraints that we knew at that moment in time. Railway has hosts in AWS, GCP, and co-los, so coordinating those workloads in a fully distributed manner would be ideal, but at the end of the day, we didn't foresee that we would have our project get deleted just like that.

(Even if we did get assurances from them in 2024, that it wouldn't happen again, although we just got auto-rate limited the last time.)

r_lee•1m ago

could you clarify, did an automated process by Google delete a GCP project/account/resource(s)? like, what exactly were you seeing when trying to get access or see what happened?

arjie•6m ago

I have exactly the same architecture. You can easily administer a postgres/mysql on your own infrastructure, but it's also the one thing where backups and availability are super strict. I can easily support multi-region in Google Cloud or AWS and that's way harder to do on-prem, and it's also hard to handle the replication story as safely as with Google Cloud. The hope is that GCP et al. give you safety and availability for the control plane stuff and you can run your data plane on-prem.

At $2m/mo spend, this kind of thing is insane. GCP has never been the most reliable of clouds but this is pretty awful. I would never have expected this.

miniman1337•30m ago

from the blog linked via Wayback Machine. "From Day 1, we had this notion at the forefront.

The other notion that we have intuited is that you can’t build a cloud on another cloud. We have devoted years of practice running our own metal (and playing well with other clouds) to make sure that Railway’s business, which invariably becomes your customer’s business, is as rock solid as possible."

Avicebron•37m ago

Isn't Railway the "API key to delete the backups is in the prod database, because that's where the backups live, duh" guys?

TheTaytay•15m ago

I’ve seen a few smug “all your eggs in one basket” comments here.

I’m aware of some companies hosting their own metal and infra, but I’m not aware of large companies mitigating risk by hosting on separate cloud providers as a fallback mechanism. We might disagree with cloud provider choice, or think they should have been hosting their own metal, but that’s still an “all your eggs in one basket” choice, right?

Heck, they might even have multi-region fallback with GCP, but if GCP bans your account, that doesn’t matter.

Are there good examples of running a company of Railway’s size so redundantly that their host could nuke one of their accounts and they’d just keep on trucking?

fontain•12m ago

They do run their own metal. That’s their entire ethos. Railway is their own cloud.

chradams•8m ago

Just google multi-cloud. Yes. It's a thing.

dwa3592•12m ago

Wait, I thought Railway was a cloud provider like AWS, GCP but better and more agile. At least that's the impression I got from their website.

whh•10m ago

This could kill a startup. I really don't like Google's automated and silent account murder functionality.

Drew-Aetherwave•9m ago

It is killing me...

Osborn_Ojure•4m ago

compute recovered, get ready boys!
