That said, I find these kinds of notifications produce more false positives than correctly detected downtime. That ends up costing more time in checking and double-checking.
On the other hand, if you are running a service with no users and you have downtime... did you really have downtime?
If you run a service and you have downtime and no one reports it, did you have downtime?
I don't even check for my services. If something goes down, I'll find out via email from one or more of my customers. It happens very rarely.
There are services like Textbelt that leave the trigger mechanisms all up to you and your local tools:
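As a rough sketch of what "bring your own trigger" could look like: a small script, run from cron or similar, that checks a URL and posts to Textbelt's `/text` endpoint when the check fails. The function names, the message format, and the injectable `opener` parameter are assumptions made for illustration; the `phone`, `message`, and `key` form fields follow Textbelt's public API.

```python
import urllib.parse
import urllib.request

TEXTBELT_URL = "https://textbelt.com/text"


def is_up(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def build_sms_request(phone, message, key):
    """Build the form-encoded POST request for Textbelt."""
    data = urllib.parse.urlencode(
        {"phone": phone, "message": message, "key": key}
    ).encode()
    return urllib.request.Request(TEXTBELT_URL, data=data, method="POST")


def check_and_alert(url, phone, key, opener=urllib.request.urlopen):
    """Send one SMS if the URL is down. Returns True when an alert was sent."""
    if is_up(url, opener=opener):
        return False
    req = build_sms_request(phone, f"DOWN: {url}", key)
    with opener(req, timeout=10) as resp:
        resp.read()  # Response body is ignored in this sketch.
    return True
```

Scheduling, retries, and de-duplicating repeated alerts are left to whatever local tools you already use.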
You bring up a good point. I think it's less of a problem for more established companies that don't face unexpected outages too often. When we were starting out with our mobile app, however, this wasn't the case, and each outage meant lost downloads, which were critical for getting early feedback. I see it as a bigger pain point for early founders/small teams whose servers could see a lot of volatility.
So far we haven't encountered any false positives (been using it for around 6 months) but perhaps with the wrong endpoint that could be a problem. I'll keep an eye out for that.
Unless you have 1 customer who lives in the same time zone or something.
I'd understand not wanting to be woken up by a page for a small operation, though.
The architecture uses scalable AWS serverless components (Lambda, SQS, DynamoDB) and is well-suited to handle a large increase in monitored endpoints. The primary scaling mechanism is the automatic concurrency scaling of the Lambda functions processing messages from SQS queues. Should we scale to 10,000 endpoints, we do expect some bottlenecks that would require tuning, e.g. increasing Lambda timeouts/memory, but we'll cross that bridge when we get to it.
For the actual SMS sending, our numbers can send up to 100 SMS texts/second.
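To make the Lambda-plus-SQS shape concrete, here is a minimal sketch of a handler for an SQS trigger that checks one endpoint per message. It is not our actual code: the message body shape (`{"url": ...}`), the return value, and the injectable `check` parameter are assumptions for illustration. Only the `Records[].body` event layout comes from how Lambda delivers SQS batches.

```python
import json
import urllib.request


def check_endpoint(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the endpoint answers with a 2xx status."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        return False


def handler(event, context, check=check_endpoint):
    """Lambda entry point: each SQS record names one endpoint to check."""
    records = event.get("Records", [])
    down = []
    for record in records:
        body = json.loads(record["body"])  # Assumed shape: {"url": "..."}
        url = body["url"]
        if not check(url):
            down.append(url)
    # A real system would persist results and enqueue alerts here.
    return {"checked": len(records), "down": down}
```

Because each batch is independent, Lambda can run more copies of the handler as the queue grows, which is the concurrency scaling described above.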
If AWS goes down, your site and mine both go down together. This was basically why PagerDuty got an early win -- they never used AWS when everyone else did.
But if something crazy like the 2023 outage happens again then you're absolutely right. Though you'd likely get a news alert for it - our fallback :)
If we get enough traction we'll look into a multi-cloud setup to mitigate that risk. For now our goal is to help with notifying you when your server goes down due to more common reasons.
This tool wouldn’t be useful for most (if not all) enterprise services I’ve worked for. For enterprise, you want fully featured synthetics services such as ThousandEyes, plus an internal monitoring and alerting system.
Also you typically don’t want to expose your health endpoint to the outside world. It’s a security risk.
So we're not aiming for enterprise on this one; we made the pricing quite accessible and kept the features minimal.
As for the health endpoint, as long as it only returns a 200 status code (without disclosing info like tokens, resource info, or server configuration), the risk is minimal.
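For illustration, a health endpoint that gives away as little as possible could look like the sketch below, using only Python's standard library. The `/health` path, the `make_server` helper, and the blanked `Server` header value are assumptions, not a description of any particular stack.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    # Override the defaults so the Server header does not reveal
    # the Python version or library name.
    server_version = "health"
    sys_version = ""

    def do_GET(self):
        if self.path == "/health":
            # Status code only: no body, no tokens, no config details.
            self.send_response(200)
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging in this sketch.
        pass


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), HealthHandler)


if __name__ == "__main__":
    make_server(8080).serve_forever()
```

Whether exposing even this much is acceptable still depends on your threat model.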
Also Tianji[1], self-hosted in the same space.
Of course, these require setup and have to be hosted somewhere. Either another server or deployed to Heroku (Sealos, Render, etc), since they obviously cannot be deployed on the server you wish to monitor!
SantiagoVargas•9mo ago
For our app it's super important because if our server goes down, users can download the app but get stuck at the sign-in flow. There are subscription services out there that do more in-depth monitoring, but this is all we needed.
I listed an alternative solution below for those wanting to build or customize their own. Ours just gets the job done, is quick to set up, and lets you avoid the monthly Twilio/SMS fees.
Another alternative we received as feedback, for those interested: "If any one wants an AWS Native way and assuming it has ALB you can target elb metric 503 via Cloudwatch Alarm and create an output to an SNS topic that goes to Slack, or use AWS chatbot/q, or set number as destination for sms via sns"
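A hedged sketch of that AWS-native route: build the parameters for a CloudWatch alarm on the ALB's `HTTPCode_ELB_503_Count` metric and point the alarm action at an SNS topic. The helper names, alarm name, threshold, and period are assumptions; the client is passed in so the sketch can be read without AWS credentials.

```python
def alb_503_alarm_params(load_balancer, sns_topic_arn, threshold=1):
    """Parameters for CloudWatch put_metric_alarm on ALB-generated 503s.

    load_balancer is the ALB dimension value, e.g. "app/my-alb/0123456789abcdef"
    (hypothetical name).
    """
    return {
        "AlarmName": "alb-503-" + load_balancer.replace("/", "-"),
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "HTTPCode_ELB_503_Count",
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
        "Statistic": "Sum",
        "Period": 60,  # seconds; assumed value
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No 503 data points means no 503s, so do not alarm on gaps.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(cloudwatch_client, load_balancer, sns_topic_arn):
    """Create the alarm using a boto3 CloudWatch client supplied by the caller."""
    params = alb_503_alarm_params(load_balancer, sns_topic_arn)
    cloudwatch_client.put_metric_alarm(**params)
    return params


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   create_alarm(boto3.client("cloudwatch"),
#                "app/my-alb/0123456789abcdef",
#                "arn:aws:sns:us-east-1:123456789012:alerts")
```

The SNS topic can then fan out to Slack, AWS Chatbot, or an SMS subscription, as the quoted feedback suggests.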
lurk2•9mo ago
SantiagoVargas•9mo ago
Zanfa•9mo ago
SantiagoVargas•9mo ago
lurk2•9mo ago
SantiagoVargas•9mo ago
thinkingemote•9mo ago
SantiagoVargas•9mo ago
Almost all monitoring services I found target enterprise, and the ones that don't are self-hosted. This solution is for the small teams/indie devs that just need to know when their server's down. Might raise the price though; I'm thinking the low price might work against me for quality perception. What do you think?
thinkingemote•9mo ago
One thing with the events model is that, for some webhosts which often do maintenance or have small periods of downtime, a user might see many events through the year. In other words, the hobby dev might see them, but I imagine a production-level small team shouldn't be using those servers anyway.
abcd_f•9mo ago
That is, it will be really hard for you to find customers.
SantiagoVargas•9mo ago