Quite rare to hear this wise line these days. And I guess with AI coding assistants, this is only the beginning of this kind of horror story.
If you're running on commodity hardware, sure; if you happen to be a $1T company that solders 3x marked-up RAM then that's definitely not true https://www.apple.com/shop/buy-mac/macbook-pro/16-inch-space... is the entry model, and clicking the 48GB option dials the price up to $3k
But it was a good call sending it to the cloud. Better for it to be "somebody else's problem" than "my problem".
We also found the same problem as OP with self hosting sentry. Each release would unleash more containers and consume more memory until we couldn't run anything on the 32gb server except Sentry.
We looked at both GlitchTip and BugSink and only settled on GlitchTip as it was maintained by a bigger team. Feature wise they were quite similar and both good alternatives.
So far so good with GlitchTip.
And thanks Op for making BugSink, the more alternatives the better.
Although with Bugsink (which is what came out of this origin story of annoyance) I'm aiming for _even more_ simple (1 Docker container at minimum, rather than 4), 30 seconds up and running etc.
This has to be self-hosted eventually either by you or Sentry themselves so the full cost of this is coming down somewhere. The planet is round and all that and there's no getting away from these inefficiencies, but you can kick the proverbial can down the road.
Also, they are incentivized to make the open product as hard to use as possible so color me surprised the self-hosted edition is hard to operate.
last time a similar thing was pointed out on HN Armin Ronacher (former Sentry) came out and pointed at the following issue:
https://github.com/getsentry/team-ospo/issues/232
but that is now closed with
> We are no longer working on this effort for now
One of the biggest pain points of running Sentry is the number of containers and a lot of this comes from how Sentry works with Kafka and Rabbit. That pain point is actively being addressed by moving to a virtualized system [1] though I am not sure how long it will take to actually adopt this.
I've seen you make this point on similar posts in the past, and I believe that you believe it.
The counterpoint would be that "the purpose of a system is what it does"...
[1]: this is because a) developing on sentry has become harder and harder on local machines and b) operating single-tenant installations for some customers brings many of the same challenges that people on self-hosted run into. c) we want more regions of Sentry and smaller ones (such as Europe today) have higher fixed costs than ideal.
It's frustrating when half the comments about a company that dares to open their product are always about how they are obviously intentionally very evil for not doing it perfectly/for totally free/with 0 friction/etc.
How entitled have we become lol?
Maybe this virtualized system will make things easier. If so, that's great. But if it ends up not working out, or if it does, but over the longer term things get more difficult again, I think that's still just kinda how things happen sometimes, and that's ok.
One of Sentry's goals is for Sentry themselves to operate it as a hosted cloud service. Architecture decisions made to further that goal can naturally and reasonably be at odds with another goal to make it simpler to self host. Sometimes things can't be one-size-fits-all.
Not really. There are alternatives. Which seems to be the point of the article.
The overhead at low volume is pretty high, but in the higher volumes (25M transactions/24h) it's a massive cost saving for us.
Edit:
There were just some initial headaches with needing to increase kafka partitions and add replications to the transaction processors, otherwise we didn't quite leverage the available compute and the backpressure would fill Redis up until OOM.
Bugsink's also quite scalable[0], but I wouldn't recommend it at 25M/day.
Well, your homepage disagrees with this statement:
> Bugsink can deal with millions of events per day on dirt cheap hardware
But it's a very fuzzy way of quantifying something, and open to various interpretations.
For those interested in only errors, the self-hosted version recently introduced errors-only mode which should cut down on the containers.
I think this is a repeated question but... are you considering the cost of the people managing the deployment, security oversight, dealing with downtime etc?
Disclosure: I'm a sysadmin.
What I said is true for places where they already have sysadmins for various tasks. For the job I do (it's easy to find), you have to employ system administrators to begin with.
So, at least for my job, working the way I described in my original comment is the modus operandi for the job itself.
If the company you're working in doesn't prefer self-hosting things, and doesn't need system administrators for anything, you might be right, but having a couple of capable sysadmins on board both enables self-hosting and allows this initiative to grow without much extra cost, because it gets cheaper as the sysadmins learn and understand what they're doing, so they can handle more things with the same/less effort.
See, system administrators are lazy people. They'd rather solve problems for once and for all and play PacMan in their spare time.
Transactions like full user flows start to finish, or 1 transaction = 1 post/get and 1 response?
For most applications we are talking closer to 1 transaction = 1 web request. Distributed tracing across microservices is possible; the level of extra effort required depends on your stack. But that's also the out-of-the-box, plug-and-play stuff. With lower-level APIs you define your own transactions, including when they start and end, which is needed for tracing applications where there isn't a built-in framework integration (e.g. not a web application).
There's a ticket now open to stop this, but it's still in progress.
Forking has down sides that can't be hand waved away too, especially for a service like this.
Feel free to email - david at sentry
https://github.com/getsentry/sentry-dotnet/issues/3636#event...
Wow, that's really cheap. I'm seriously overpaying for my cloud provider and need to try Hetzner. I always assumed Hetzner was only European based.
FWIW, https://lowendbox.com/ is good fun for the former set of things, too
The argument that you have to read a sh script doesn't make sense to me. Are you gonna read the source code of any software that is referenced in this script, or of anything else you download? No? What's the difference between that and a bash script? At the end of the day both can do damage.
Helm is a huge pain in the butt if you have mitigation obligations because the overall supply chain for a 1-command install can involve several different parties, who all update things at different frequencies :/
So chart A includes subchart B, which consumes an image from party C, who haven't updated to foobar X yet. You either need to wait for 3 different people to update stuff to get mainline fixed, or you roll up your sleeves and start rebuilding things, hosting your own images and forking charts. At first you build 1 image and set a value but the problem grows over time.
If you update independently you end up running version combinations of software that the OG vendor has never tested.
This is not helm's fault of course; it's just the reality of deploying software with a lot of moving parts.
> Application monitoring software considered "not bad" by 4 million developers.
Sounds pretty bad to me
Doesn't mean the complaints about self-hosted aren't valid, but "literally has to scale to the most insane volumes of data" and "is not good software" are two different things.
We're building a cloud service at the end of the day. It's a lot easier to optimize a multi-tenant install than it is a single-tenant install, and that complexity shows up in the self-hosted repo. Can't really avoid it.
I use Sentry with most of my clients, and for effective debugging I need to spin my own Sentry in a Docker container which ends up being quite heavy on my machine especially when combined with Grafana and Prometheus.
I'm really unhappy with virtually all monitoring/telemetry/tracking solutions.
It really feels like they are all designed to lock you into their expensive cloud solutions, and I really don't feel I'm getting my $s back at all. Worst of all, those cloud vendors would rather add new features non-stop than hone what they currently have.
Their sales are everywhere, I've seen two different clients getting Datadog sales people join private Slacks to evangelize their products.
Both times I escalated to the CTO, both times I ended up suspecting someone in management had something to gain from pushing teams to adopt those solutions.
Killing flies with hammers and all, but since I really like my hammer I actually do all my local development with my full-blown error tracker too:
I can only commend the hustle on their part, but it does feel a little like a high pressure time share situation.
david at sentry.io
As others have said, we've [0] found the only practical way to deploy this for our clients is Kubernetes + Helm chart, and that's on bare-metal servers (mostly Hetzner). It runs well if you can throw hardware and engineering time at it, which thankfully we can. But given the option we would love a simpler solution.
[0]: https://lithus.eu
But we specialise in this so that our clients don't have to. As much as I do actually love Kubernetes, the fact that the _easiest_ way to self-host Sentry is via Kubernetes is not a good sign. And choosing to spin up a Kubernetes cluster just to run Sentry would feel a lot like the lady who swallowed a fly[0].
[0]: https://en.wikipedia.org/wiki/There_Was_an_Old_Lady_Who_Swal...
That said I would honestly prefer if the industry would just settle on K8s as our OS.
I really do not see any benefit that sentry could bring on its own compared to a solid set of Helm charts for k8s.
I've been self-hosting Sentry for over 10 years: Sentry is installed by running `git clone`, editing one or two configuration files, and running ./install.sh. It requires 16GB RAM if you enable the full feature set. That includes automatically recording user sessions on web or mobile for replay, and support for end-to-end tracing (so seeing which database queries your api asks for in response to a button tap in your mobile app).
Sentry has a wonderful self-hosting team that's working hard to make Sentry's commercial feature set available, for free, to everyone that needs a mature error tracking solution. You can talk to them on discord and they listen and help.
Georg Hendrik = "George Henry", pretty common name. The fact that Google returned a result when you searched "Georg Hendrik Sentry" should not be considered weird.
Regarding using the SDKs, I'm telling my users to take Sentry at their word when they wrote "Permission is hereby granted, free of charge [..] to deal in the Software without restriction, including without limitation the rights to use"
OP built a product because they were frustrated by Sentry's seeming hostility toward self-hosting. It doesn't feel like OP decided to build a competing product and then thought it would be a good marketing strategy to falsely push the idea that Sentry is difficult to self-host.
FWIW I've never self-hosted Sentry, but poking around at their docs around it turns me off to the idea. (I do need a Sentry-like thing for a new project I'm building right now, but I doubt I'll be using Sentry itself for it.) Even if it's possible to run with significantly less than 16GB (like 1GB or so), just listing the requirements that way suggests to me that running in a bare-bones, low-memory configuration isn't well tested. Maybe it's all fine! But I don't feel confident about it, and that's all that matters, really.
this is indeed the timeline.
And then he does the exact same thing, on behalf of Sentry.
I hope he got paid for this. Otherwise it would just be sad.
I’m not sure how I feel about the license though (Polyform Shield, basically use-but-don’t-compete). It’s a totally valid choice – I just wish it would convert to FOSS at some point. (I understand the concern as I’ve had to make a similar decision when releasing https://lunni.dev/. I went with AGPL, but I’m still overthinking it :-)
- save post body to folders (use uuid as folder name to avoid spam)
- dir listing, and count number of entries
- render posted json to html, highlight stacktrace with js
- download raw json
- rotate, compress old entries.
I gave those requirements to an LLM, and got a pretty much working Rust implementation after a few tweaks. It uses <5M RAM idle.
Any insights on why Sentry is so complex and needs so many resources? Is collecting, storing, and organizing error messages and stack traces at scale difficult? Or is it the other features on top of this?
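To make the first two requirements in that list concrete, here is a minimal sketch in Python. It is not the commenter's Rust implementation; the function names `save_event` and `count_events`, the one-file-per-event layout, and the use of the UUID as a per-project folder name are all assumptions made for illustration. HTML rendering, raw download, and rotation are left out.

```python
import json
import tempfile
import uuid
from pathlib import Path


def save_event(root: Path, project_id: str, body: bytes) -> Path:
    """Store one posted error report as a JSON file under root/<project_id>/.

    Hypothetical helper: project_id must parse as a UUID, so a random POST
    cannot create arbitrary directories (the "avoid spam" idea from the list).
    """
    # uuid.UUID raises ValueError for anything that is not a valid UUID;
    # str() normalizes it to the canonical hyphenated form.
    folder_name = str(uuid.UUID(project_id))
    event = json.loads(body)  # reject bodies that are not valid JSON
    folder = root / folder_name
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{uuid.uuid4().hex}.json"
    path.write_text(json.dumps(event))
    return path


def count_events(root: Path, project_id: str) -> int:
    """Directory listing reduced to its simplest form: count stored events."""
    folder = root / str(uuid.UUID(project_id))
    return sum(1 for _ in folder.glob("*.json")) if folder.is_dir() else 0


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    project = str(uuid.uuid4())
    save_event(root, project, b'{"message": "boom", "stacktrace": ["a.py:1"]}')
    print(count_events(root, project))
```

A sketch like this stays tiny precisely because it skips grouping, release tracking, and notifications, which is part of the answer to the question above.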
- they had enough money that they never needed to think seriously about maintenance cost, and the sales process was strong enough to keep customers arriving anyway (look to Oracle for another example of a hopelessly complicated installation process that people keep using anyway)
- at some point someone realized this was actually a feature: the more complicated it got, the harder it became to self host. And from that perspective it is a win-win for the company: they can claim it is open source without being afraid that most people will choose to self host.
> actually a feature
I would guess that for a few people (e.g. the ones who made the scary visual of rising costs) this is explicitly so, but for most people it's more implied. i.e. I don't think anyone advanced their career with Sentry by making self-hosting easier.
They have all sorts of caching, autoscaling, distributed systems and other stuff that's complete overkill for all except the largest installations. Plus all sorts of software features only needed by a few customers and extra layers to be multi-customer.
It's the difference between a hoop in your back yard and an NBA stadium
As in, a huge SaaS company offers their product for self-hosting to individual companies, but it's not practical to self-host because the code is highly specialized for supporting hundreds of companies instead of just one? And it's hard to have an architecture that works well for just one and for hundreds?
For example Sentry requires ClickHouse, Postgres, Kafka, and Redis presumably because they were the right tools for their needs and either they have the resources to operate them all or the money to buy the managed options from vendors.
Also, the main concern people have with hosting Sentry is the sheer number of containers required but most of them are just consumers for different Kafka queues which again is presumably this way because Sentry ops prefers it this way, whether it be for fine tuning the scaling of each one or whatever the reason.
What makes sense for a SaaS company rarely translates to sensible for self-hosting.
I don't have servers like this but the author makes it easy to understand, and it applies to a lot of other things.
The Comparative Costs picture does tell it all.
The purpose of so much software is to reduce human and machine costs through time, and this apparently turned out to do just the opposite, apparently after long-term testing under real-world conditions.
Could be an insurmountable fundamental structure of technical debt or something like that which metastasizes.
Then this, anyone could say about anything:
>I’m not going to run a piece of software that requires 16GB of RAM, has a complex installation script, and is known to be a pain to maintain. I’m not going to recommend it to anyone else either.
Easier said than done, sometimes it's the only option.
It's a "script". Maybe that's why you have to "rehearse" it more than once before you barely get it right, and then you might have to really go the extra mile (and have a bit of good fortune) before you can achieve a "command performance".
How do you think it feels these days to have to settle for this kind of thing in expensive proprietary software too?
It might not do everything Sentry does but it definitely has helped with tracking down some issues, even production ones and runs in a fairly manageable setup (in comparison to how long even the Sentry self-hosted Docker Compose file is).
What’s more, if you want, you can even use regular PostgreSQL as the backing data store (might not be quite as efficient as ElasticSearch for a metrics use case, but also doesn’t eat your RAM like crazy).
1: yes, I'm a huge k8s fanboi and yes I long every day for them to allow me to swap out etcd for something sane
Personally, no hate towards their BanyanDB but after getting burnt by OrientDB in Sonatype Nexus, I very much prefer more widespread options.
This lines up with my experience self hosting a headless BI service. In "developer mode" it takes maybe 1GB RAM. But as soon as you want to go prod you need multiple servers with 4+ cores and 16GB+ RAM that need a strongly consistent shared file store. Add confusing docs to the mix, mysterious breakages, incomplete logging and admin APIs, and a reliance on community templates for stuff like k8s deployment... it was very painful. I too gave up on self hosted.
This is caused by short-sighted management that needs to deliver and move on. "Long term" is in contradiction with their business model. In this case "long term" means "after product launch".
In my experience, solutions like Mailcow, which involve multiple services and containers (such as SMTP, IMAP, Redis, SSO, webmail, Rspamd, etc.), work very well. I have extensive experience running these systems, performing backups, restoring data, and updating their ecosystems.
Additionally, I've had a positive experience setting up and running a self-hosted Sentry instance with Docker for a project that spanned several years. However, this experience might be somewhat outdated, as it was a few years ago.
- error happens, can be attributed to release 1.2.3
- every subsequent time that error happens to a different user, it can track who was affected by it, without opening a new error report
- your project can opt-in to accepting end-user feedback on error: "please tell us what you were doing when this exploded, or feel free to rant and rave, we read them all"
- it knows from the stack trace that the error is in src/kaboom/onoz.py line 55
- onoz.py:55 was last changed by claude@example.com last week, in PR #666
- sentry can comment upon said PR to advise the reviewers of the bad outcome
- sentry can create a Jira with the relevant details
- claude.manager@example.com can mark the bug as "fixed in the next release", which will cause sentry to suppress chirping about it until it sees a release 1.2.4
- if it happens again it will re-open the prior error report, marking it as a regression
Unless you know something I don't, Grafana does *ABSOLUTELY NONE* of that
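The grouping and regression part of that workflow can be sketched in a few lines of Python. This is an illustration of the idea only, not Sentry's actual data model or API: `Issue`, `IssueTracker`, the fingerprint string, and the dotted-integer version comparison are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


def parse_version(release: str) -> Tuple[int, ...]:
    """Turn "1.2.3" into (1, 2, 3); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in release.split("."))


@dataclass
class Issue:
    fingerprint: str
    first_release: str
    affected_users: Set[str] = field(default_factory=set)
    event_count: int = 0
    status: str = "open"  # "open", "resolved_in_next_release" or "regressed"
    resolved_after: Optional[str] = None


class IssueTracker:
    def __init__(self) -> None:
        self.issues: Dict[str, Issue] = {}

    def record_event(self, fingerprint: str, release: str, user: str) -> bool:
        """Record one occurrence; return True only when someone should be notified."""
        issue = self.issues.get(fingerprint)
        is_new = issue is None
        if issue is None:
            issue = Issue(fingerprint, release)
            self.issues[fingerprint] = issue
        # Repeat occurrences are folded into the same issue, tracking who was hit.
        issue.event_count += 1
        issue.affected_users.add(user)
        if (
            issue.status == "resolved_in_next_release"
            and issue.resolved_after is not None
            and parse_version(release) > parse_version(issue.resolved_after)
        ):
            # Seen again in a release that was supposed to contain the fix.
            issue.status = "regressed"
            return True
        return is_new

    def resolve_in_next_release(self, fingerprint: str, current_release: str) -> None:
        """Mark as fixed in whatever release comes after current_release."""
        issue = self.issues[fingerprint]
        issue.status = "resolved_in_next_release"
        issue.resolved_after = current_release
```

With this sketch, events from 1.2.3 after the issue is marked fixed stay quiet, while the first event from 1.2.4 flips the issue to "regressed". Blame lookup, PR comments, and Jira tickets would be integrations layered on top and are not shown.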
In my Django app I wrote a logging handler that stores the log records (including traceback) in a database table. I can inspect the log records through Django admin, and a cron job sends me daily emails saying "X new log records in the last 24 hours" so I know to check them out. And that's it :-)
Of course, this does a lot less than Sentry, and has various limitations (e.g. what if the error is about the database being down...), but it fits my needs.
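For readers who want to try the same idea outside Django, a rough equivalent can be built on the standard library alone. The sketch below swaps the Django model for a `sqlite3` table; the class name `SQLiteLogHandler`, the table layout, and `count_since` (standing in for the daily "X new log records" summary) are assumptions, not the commenter's code.

```python
import logging
import sqlite3


class SQLiteLogHandler(logging.Handler):
    """Write each log record, with its traceback if any, into a SQLite table."""

    def __init__(self, db_path: str = ":memory:") -> None:
        super().__init__()
        # check_same_thread=False because loggers may be used from several
        # threads; logging.Handler already serializes calls to emit().
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS log_record ("
            "id INTEGER PRIMARY KEY, created REAL, level TEXT, "
            "logger TEXT, message TEXT, traceback TEXT)"
        )

    def emit(self, record: logging.LogRecord) -> None:
        try:
            traceback_text = ""
            if record.exc_info:
                # Reuse the stdlib formatter to render the traceback as text.
                traceback_text = logging.Formatter().formatException(record.exc_info)
            with self.conn:  # commits on success, rolls back on error
                self.conn.execute(
                    "INSERT INTO log_record "
                    "(created, level, logger, message, traceback) "
                    "VALUES (?, ?, ?, ?, ?)",
                    (
                        record.created,
                        record.levelname,
                        record.name,
                        record.getMessage(),
                        traceback_text,
                    ),
                )
        except Exception:
            # Never let a logging failure crash the application.
            self.handleError(record)

    def count_since(self, timestamp: float) -> int:
        """Number of records created at or after the given Unix timestamp."""
        row = self.conn.execute(
            "SELECT COUNT(*) FROM log_record WHERE created >= ?", (timestamp,)
        ).fetchone()
        return row[0]
```

A cron job could call something like `count_since(time.time() - 86400)` and email the number. The limitation mentioned above still applies: if the database itself is down, the handler has nowhere to write.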
BTW, IIUC, Sentry in its early beginnings was also doing just that – logging to database: https://github.com/dcramer/django-db-log
FWIW, we've been self-hosting 2 instances (and purchase a third hosted instance), for, it looks like 8 years now, and it's had a few small bumps but hasn't been too bad. Our instances are running with 20GB of RAM, so pretty small. ~90% of updates have gone smoothly, which leaves 10% of updates that have had some sort of problem. Solutions have been found in the issue tracker in all our cases. We are a little afraid of doing updates, but do them every couple of months.
Sentry is an amazing piece of software and it is fantastic that they offer a self-hosting version, and all things considered it is a fairly easy self-host.
I "gave up" from the perspective of returning to Sentry after a couple of years, and finding an entirely different beast from the tool I loved before. At that point I indeed didn't make it past Sentry's own FUD.
For example, Firefox tries to connect
---
IMHO, the RUM tooling[1] is the worst offender, tracking mouse movement and keystrokes. At least Sentry is usually just trying to report JS kabooms
1: https://docs.datadoghq.com/real_user_monitoring/session_repl... and https://docs.newrelic.com/docs/browser/browser-monitoring/br... et al
I respect the fact that the Sentry people are honest about their teetering tower.
maybe they should put in a system to monitor the docker containers.
It's relatively new and did take some tinkering to make it work properly, so I wrote a short article about it: https://weberdominik.com/blog/self-host-hyperdx
But the feature set and user experience is great!
vanschelven•1d ago
When I posted this myself on Reddit, I said the following:
I've long held off on actually posting this article to a platform like this one (don't bash your competition and all that), but "isn't Sentry self-hosted?" _is_ one of the most asked questions I get, and multiple people have told me this blog-post actually explains the rationale for Bugsink better than the rest of the site, so there you have it.
yarekt•1d ago
Feedback on competition bashing: sometimes they deserve it, they should really just come out and say it: “open sourcing our stuff isn’t working for us, we want to keep making money on the hosting”, and that would be ok
zeeg•15h ago
https://blog.sentry.io/building-an-open-source-service/
We enable self-hosting because not everyone can use a cloud service (e.g. government regulation), otherwise we probably wouldn't even spend energy on it. We don't commercialize it at all, and likely never will. I strongly believe people should not run many systems themselves, and something that monitors your reliability is one such system. The lesson you learn building a venture-backed company, and one that most folks miss: focus on growth, not cost-cutting. Self-hosting for many is a form of cost-cutting.
We do invest in making it easier, and it's 100% a valid complaint that the entire thing is awful today to self-host, and most people don't need a lot of the functionality we ship. It's not intentional by any means, it's just really hard to enable a tiny-scale use-case while also enabling someone like Disney Plus.
miyuru•19h ago
vanschelven•16h ago
In fact I did one last week, but it got only a fraction of today's article's traction... I'll try again in whatever the prescribed interval is :-)
apitman•15h ago