The title should have been: "how a single line of code cost our users probably more than $8000"
Just amazed. Yeah, suggesting ‘write code carefully’ as if that’ll fix it is a rookie mistake.
So so frustrating when developers treat user machines like their test bed!
Avoidable, unfortunate, but the cost of slowing down development progress by, say, 10% is much higher.
But I agree that senior gatekeepers should know by heart some places where review needs to be extra careful. Like security pitfalls, exponential backoff in error handling, and yeah, probably this.
Although, after such a fuck up, I would be tempted to make a pre-release check that tests the compiled binary, not any unit test or whatever. Use LD_PRELOAD to hook the system timing functions (a quick Google shows that libfaketime[0] exists, but I've never used it), launch the real program, and speed up time to make sure it doesn't try to download more than once.
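To make the "speed up time" idea concrete, here is a minimal sketch that swaps in a different technique: an in-process fake clock instead of LD_PRELOAD against the real binary. `FakeClock`, `Updater`, and `simulate` are hypothetical names made up for illustration, not anything from Screen Studio or libfaketime.

```python
class FakeClock:
    """Manually advanced clock standing in for the system timer (hypothetical)."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class Updater:
    """Hypothetical updater: checks on a fixed interval, downloads each version once."""

    def __init__(self, clock, check_interval_s, latest_version):
        self.clock = clock
        self.check_interval_s = check_interval_s
        self.latest_version = latest_version
        self.downloaded_version = None
        self.last_check = None
        self.download_count = 0

    def tick(self):
        # Only run the check when the interval has elapsed on the fake clock.
        due = (
            self.last_check is None
            or self.clock.now - self.last_check >= self.check_interval_s
        )
        if not due:
            return
        self.last_check = self.clock.now
        # The guard under test: never fetch a version that is already on disk.
        if self.downloaded_version != self.latest_version:
            self.download_count += 1
            self.downloaded_version = self.latest_version


def simulate(hours, check_interval_s=300):
    """Fast-forward the given number of hours and return how many downloads happened."""
    clock = FakeClock()
    updater = Updater(clock, check_interval_s, latest_version="2.0.0")
    for _ in range(hours * 60):  # one tick per simulated minute
        updater.tick()
        clock.advance(60)
    return updater.download_count
```

A pre-release check would then assert that `simulate(24 * 30)` reports a single download. It does not replace testing the shipped binary, but it catches the repeated-download class of bug cheaply.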
After I shipped a bug the Director of Engineering told me I should "test better" (by clicking around the app). This was about 1 step away from "just don't write bugs" IMO.
TBH, that was well done for what it was but really called for automation and lacked unit-testing.
I don't get the impression they did any testing at all.
Good on them. Most companies would cap their responsibility at a refund of their own service's fees, which is understandable as you can't really predict costs incurred by those using your service, but this is going above and beyond and it's great to see.
How the times have changed ..
Screen Studio can collect basic usage data to help us improve the app, but you can opt out of it during the first launch. You can also opt out at any time in the app settings.
1) Emergency update for remote exploit fixes only
2) Regular updates
The emergency update can show a popup, but only once. It should explain the security risk, but allow the user to decline, as you should never interrupt work in progress. After a decline, leave an always-visible small warning banner in the app until approved.
The regular update should never pop up; it should only show a very mild update reminder that is NOT always visible, tucked behind a menu that is frequently used. Do not show notification badges; they frustrate people with an inbox-zero compulsion.
This is the most user friendly way of suggesting manual updates.
You have to understand, if a user has 30 pieces of software, they have to update every day of the month. That is not a good overall user experience.
That's not a user issue tho, it's a "packaging and distribution of updates" issue, which coincidentally has been solved on other OSes using a package manager.
If the update interval had been 1 day+, they probably wouldn't have noticed after one month when they had a 5 minute update check interval.
This could have easily been avoided by prompting the user for an update, not silently downloading it in the background... over and over.
Their users do not care about their screen recording studio anywhere near as much as the devs who wrote it do.
Once a month is probably plenty.
Personally, I disable auto-update on everything wherever possible, because the likelihood of annoying changes is much greater than welcome changes for almost all software I use, in my experience.
It is sort of fun (for $8,000) as it was “just” a screenshotter, but imagine this with a bank app or any other heavily installed app.
All cloud providers should have alerts for excessive use of network by default. And they should ask developers if they really want to turn alerts off.
I remember a Mapbox app that cost much more, just because the provider charged by the month… and there was a big dispute over whose fault it was…
What might be fun is figuring out all the ways this bug could have been avoided.
Another way to avoid this problem would have been using a form of “content addressable storage”. For those who are new, this is just a fancy way of saying make sure to store/distribute the hash (e.g. SHA-256) of what you’re distributing and store it on disk in a way that content can be effectively deduplicated by name.
It’s probably not so easy as to make it a rule, but most of the time, an update download should probably do this.
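A rough sketch of the content-addressed idea, assuming the update server publishes the SHA-256 of each release. `ContentStore` and its methods are hypothetical names for illustration; the `download` argument is whatever callable actually fetches the bytes.

```python
import hashlib
from pathlib import Path


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Hypothetical on-disk store where each file is named after its own hash."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def path_for(self, digest: str) -> Path:
        return self.root / digest

    def fetch(self, digest: str, download):
        """Return (data, downloaded). The download only runs on a cache miss."""
        target = self.path_for(digest)
        if target.is_file():
            # Already have this exact content: no network traffic at all.
            return target.read_bytes(), False
        data = download()
        if sha256_hex(data) != digest:
            raise ValueError("downloaded content does not match the expected hash")
        target.write_bytes(data)
        return data, True
```

With this shape, asking for the same release a second time is a local file read, so a misfiring timer would cost a hash lookup rather than another 250MB transfer.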
The most obvious one is setting up billing alerts.
Past a certain level of complexity, you're often better off focusing on mitigation than trying to avoid every instance of a certain kind of error.
The number of times I have caught junior or even experienced devs writing potential PII leaks is absolutely wild. It's just crazy easy in most systems to open yourself up to potential legal issues.
The website makes it seem like it's a one person shop.
If you're not confident you can review a piece of code you wrote and spot a potentially disastrous bug like the one in OP, write more tests.
The context where it makes the most sense is accepting code from strangers in a low-trust environment.
The alternative to trying to prevent mistakes is making it easy to find and correct them. Run CI on code after it’s been merged and send out emails if it’s failed. At the end of a day produce a summary of changes and review them asynchronously. Use QA, test environments, etc.
The author seemed to enjoy calculating the massive bandwidth numbers, but didn’t stop to question whether 5 minutes was totally ridiculous.
Previous discussion: https://news.ycombinator.com/item?id=35858778
Electron really messed up a few things in this world
https://en.m.wikipedia.org/wiki/Knight_Capital_Group#2012_st...
440m usd
The url specifically asks Wikipedia to serve the mobile site.
I’m sorry but it’s exactly cases like these that should be covered by some kind of test, especially when diving into a refactor. Admittedly it’s nice to hear people share their mistakes and horror stories, but I would get some stick for this at work.
Curious where the high-water mark is across all HNers (:
Our team had a bug that cost us about $120k over a week.
Another bug running on a large system had an unmeasurable cost. (Could be $K, could be $M)
The relevance is that instead of checking for a change every 5 minutes, the delay wasn't working at all, so the check ran as fast as possible in a tight loop. This was between a server and a blob storage account, so there was no network bottleneck to slow things down either.
It turns out that if you read a few megabytes 1,000 times per second all day, every day, those fractions of a cent per request are going to add up!
I understand the reasoning, but that makes it feel a bit too close to a C&C server for my liking. If the update server ever gets compromised, I imagine this could increase the damage done drastically.
No, it doesn't mean that.
The auto-updater introduced a series of bad outcomes.
- Downloading the update without consent, causing traffic for the client.
- Not only that, the download keeps repeating itself every 5 minutes? You did at least detect whether the user is on a metered connection, right...?
- A bug where the update popup interrupts flow
- A popup is in itself a bad thing to do to your users. I think it is OK when closing the app, with the rest done in the background.
- Some people actually pay attention to the outgoing connections apps make, and even a simple update check every 5 minutes is excessive. Why even do it while the app is running? Do it on startup and ask on close. Again some complexity: assume you're not on a network, do it in the background, and don't bother retrying much.
- Additional complexity for the app, which caused all of the above. And it came with a price tag for the developer.
Wouldn't an app store be the perfect way to handle updates in this case, to offload the complexity there?
That was a thing I thought was missing from this writeup. Ideally you only roll out the update to a small percent of users. You then check to see if anything broke (no idea how long to wait, 1 day?). Then you increase the percent a little more (say, 1% to 5%) and wait a day again and check. Finally you update everyone (who has updates on).
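The staged rollout described above can be sketched with a stable hash bucket per user. This is an illustration under assumptions: `rollout_bucket` and `should_offer_update` are hypothetical names, and it presumes the client has some stable identifier to hash.

```python
import hashlib


def rollout_bucket(user_id: str, release: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given release."""
    digest = hashlib.sha256(f"{release}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def should_offer_update(user_id: str, release: str, percent: int) -> bool:
    """True if this user falls inside the current rollout percentage."""
    return rollout_bucket(user_id, release) < percent
```

Because the bucket is derived from the release and the user, raising the percentage from 1 to 5 only adds users; nobody who already got the update is dropped, and each release picks a different early cohort.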
Thinking of it, the discussed do-it-yourself update checking is so stupid that malice and/or other serious bugs should be assumed.
Screen Studio has 32k followers, let's say 6% are end users, 2000 users at $229, that is $137k in App Store fees.
I am going to say writing your own app update script is a wash time-wise, as getting your app published is not trivial, especially for an app that requires as many permissions as Screen Studio.
If you’re a small shop or solo dev, it is real hard to justify going native on three platforms when Electron gives it for (near) free. And outside of HN, no one seems to blink at a 250MB bundle.
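For anyone checking the $137k figure: it only works out if you assume a 30% store commission, which the comment does not state. A quick sketch of the arithmetic, with that rate labeled as an assumption:

```python
FOLLOWERS = 32_000
END_USER_SHARE_PCT = 6   # "let's say 6% are end users"
PRICE_USD = 229          # price quoted in the thread
COMMISSION_PCT = 30      # ASSUMPTION: not stated in the comment


def estimated_users() -> int:
    """6% of 32k followers, before rounding up to 2000."""
    return FOLLOWERS * END_USER_SHARE_PCT // 100


def store_fees(users: int) -> int:
    """Commission in whole dollars on one purchase per user."""
    return users * PRICE_USD * COMMISSION_PCT // 100
```

`estimated_users()` gives 1920, which the comment rounds to 2000. `store_fees(2000)` gives 137400, matching the quoted $137k; with the unrounded 1920 users it would be 131904.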
There are alternatives like Tauri that use the system browser and allow substantially smaller bundles, but they’re not nearly as mature as Electron, and you will get cross platform UI bugs (some of which vary by user’s OS version!) from the lack of standardization.
I’d actually seen this project before because the author did a nice write-up on using React portal to portal into Electron windows[1], which is something I decided to do in my app.
I’d just assumed his was a cross platform project.
1: https://pietrasiak.com/creating-multi-window-electron-apps-u...
But then the HN crowd would complain "why use an app store? that's gate keeping, apple could remove your app any day, just give me a download link, and so on..."
You literally can't win.
I think that is the essence of what is wrong with cloud costs: defaulting to letting everyone scale rapidly, while in reality 99% have quite predictable costs month over month.
For those interested in this topic, and how other industries (e.g. Airline industry) deal with learning from or preventing failure: Sidney Dekker is the authority in this domain. Things like Restorative Just Culture, or Field guide to understanding human error could one day apply to our industry as well: https://sidneydekker.com/books.
Seriously this alone makes me question everything about this app.
> Write your auto-updater code very carefully.
You have to be soooo careful with this stuff. Especially because your auto-updater code can brick your auto-updater.
It looks like they didn't do any testing of their auto update code at all, otherwise they would have caught it immediately.
At some scale, such careless mistakes are going to create real effects for all users of the internet through congestion as well.
If this was not a $8000 mistake but was somehow covered by a free tier or other plan from Google Cloud, would they still have considered it a serious bug and fixed it as promptly?
How many such poor designs are out there generating traffic and draining common resources?
Once a day would surely be sufficient.
Weekly or monthly would be sufficient. I'd also like "able to be disabled manually, permanently" as an option, too.
Good way of showing adoption and growth.
Nobody under any circumstances needs usage stats with 5 minute resolution. And certainly not a screen recorder.
Data centers are big and scary, nobody wanted to run their own. The hypothetical cost savings of firing half the IT department was too good to pass up.
AWS even offered some credits to get started, first hit's free.
Next thing you know your AWS spend is out of control. It just keeps growing and growing and growing. Instead of writing better software, which might slow down development, just spend more money.
Ultimately in most cases it's cheaper in the short term to give AWS more money.
A part of me wants to do a $5 VPS challenge. How many users can you serve with $5 per month? Maybe you actually need to understand what your server is doing?
I'm talking nonsense, I know.
I lost it
It's just tricky, basically one fat edge case, and a critical part of your recovery plan in case of serious bugs in your app.
(This bug isn't the only problem with their home-grown updater. Checking every 5 min is just insane. Kinda tells me they aren't thinking much about it.)
You can use whatever you want outside of the App Store - most will use Sparkle to handle updates https://sparkle-project.org/. I presume Windows is similar.
Especially for a Mac-only application where Sparkle (https://sparkle-project.org/) has been around for almost two decades now and has been widely used across all sorts of projects to the point that it's a de facto standard. I'd be willing to bet that almost every single Mac "power user" on the planet has at least one application using Sparkle installed and most have a few.
$229 per year on a closed source product and this is the level of quality you can expect.
You can have all the respect for users in the world, but if you write downright hazardous code then you're only doing them a disservice. What happened to all the metered internet plans you blasted for 3 months? Are you going to make those users whole?
Learning from and owning your mistake is great and all, but you shouldn't be proud or gloating about this in any way, shape, or form. It is a very awkward and disrespectful flex on your customers.
Well, you should hire a contractor to set up the console for you.
"Designed for MacOS", aah don't worry, you will have the money from apes back in no time. :)
In the grand scheme of things, $8k is not much money for a business, right? Like we can be pretty sure nobody at Google said “a-ha, if we don’t notify the users, we will be able to sneak $8k out of their wallets at a time.” I think it is more likely that they don’t really care that much about this market, other than generally creating an environment where their products are well known.
If the file contains invalid JS (syntax error, or features too new for IE on Win7/8), or if it's >1MB (Chromium-based browsers & Electron limit), and the file is configured system-wide, then EVERY APP which uses wininet starts flooding the server with requests over and over, almost in an endless loop (missing/short error caching).
Over the years, this resulted in DDoSing my own server and blackholing its IP at the BGP level (happened 10+ times), and after switching to public IPFS gateways to serve the files, the Pinata IPFS gateway blocked the entire country, and on the IPFS.io gateway the files were in the top #2 requests for weeks (impacting the operational budget of the gateway).
All of the above happens with tight per-IP per-minute request limits and other measures to conserve bandwidth. It's used by 500 000+ users daily. My web server is a $20/mo VPS with unmetered traffic, and thanks to this, I was never in the same situation as the OP :)
screen.studio is macOS screen recording software that checks for updates every five minutes. Somehow, that alone is NOT the bug described in this post. The /other/ bug described in this blog is: their software also downloaded a 250MB update file every five minutes.
The software developers there consider all of this normal except the actual download, which cost them $8000 in bandwidth fees.
To re-cap: Screen recording software. Checks for updates every five (5) minutes. That's 12 times an hour.
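For scale, a quick back-of-the-envelope sketch of what "250MB every five minutes" means per running instance. The constants come from the thread; the function names are just illustrative, and this deliberately says nothing about user counts or bandwidth pricing, which the thread does not give.

```python
UPDATE_SIZE_MB = 250     # update file size quoted in the thread
CHECK_INTERVAL_MIN = 5   # check/download interval quoted in the thread


def downloads_per_hour(interval_min: int = CHECK_INTERVAL_MIN) -> int:
    """How many times the timer fires in one hour."""
    return 60 // interval_min


def traffic_mb(hours_running: int,
               size_mb: int = UPDATE_SIZE_MB,
               interval_min: int = CHECK_INTERVAL_MIN) -> int:
    """Total megabytes pulled by one instance left running for the given hours."""
    return hours_running * downloads_per_hour(interval_min) * size_mb
```

`traffic_mb(1)` is 3000, so roughly 3GB per hour for a single open app, and `traffic_mb(24)` is 72000, about 72GB per day if it is left running.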
I choose software based on how much I trust the judgement of the developers. Please consider if this feels like reasonable judgement to you.
How on earth is a screen recording app 250 megabytes
I work with developers in SCA/SBOM and there are countless devs that seem to work by #include 'everything'. You see crap where they include a misspelled package name and then they fix it by including the right package but not removing the wrong one!
I use Screen Studio, it’s awesome, and I’ll keep using it.
On one hand it's good that the author owns up to it, and they worked with their users to provide remedies. But so many things aren't adding up. Why does your screen recorder need to check for updates every 5 minutes? Once a day is more than enough.
This screams "We don't do QA, we just ship"
What's really scary here is the lack of consent. If I want to record videos I don't necessarily have an extra 250MB to spend (many users effectively pay by the gig) every time the developer feels like updating.
Looking at the summary section, I'm not convinced these guys learned the right lesson yet.
Nothing has been learned in this post and it cost him $8,000 because of inadequate testing.
This is still bad. I was really hoping the bug would have been something like "I put a 5 minute check in for devs to be able to wait and check and test a periodic update check, and forgot to revert it". That's what I expected, really.
We used Sparkle, https://sparkle-project.org/, to do our updates. IMO, it was a poor choice to "roll their own" updater.
Our application was very complicated and shipped with Mono... And it was only about ~10MB. The Windows version of our application was ~2MB and included both 32-bit and 64-bit binaries. WTF are they doing shipping a 250MB screen recorder?
So, IMO, they didn't learn their lesson. The whole article makes them look foolish.