Get yourself on the public suffix list or get better moderation. But of course just moaning about bad Google is easier.
The last point is actually the one I'm trying to make.
From spam blocking that builds heuristics from the spam people manually flag in Gmail, to Safe Browsing using attacks on users' Chrome installs as a signal, to their voice recognition engine leapfrogging the industry standard a few years back because it was trained on the low-quality signal from GOOG411 calls, Google keeps building products by harvesting user data... and users keep signing up because the resulting products are good.
This puts a lot of power in their hands, but I don't think it's bad by default... If it becomes bad, users leave and Google starts to lose their quality signal, so they're heavily incentivized to provide features users want in order to retain them.
This does make it hard to compete with them. In the US at least, antitrust enforcement has generally been about user harm, not market harm. If a company has de facto control but customers aren't getting screwed, that's fine, because ultimately the customer matters (and nobody else is owed a shot at being a Google).
Folks around here are generally uneasy about tracking too, but remove the big-brother monitoring from Safe Browsing and this story could still play out the same way: the whole domain blacklisted by Google, just based on manual reporting instead.
"Oh, but a human reviewer would've known `*.statichost.eu` isn't managed by us"—not in a lot of cases, not really.
There is no real way a normal person can even flag Facebook.
On the whole of YouTube, it's a tiny sliver of a percentage, but because YouTube has grown too large to moderate, it's still hosting these videos.
If Google applied the same rules they apply to the safe browsing list, they'd probably get YouTube flagged multiple times a week.
In my experience, Safe Browsing does theoretically allow you to report scams and phishing in user-generated content, but the report won't go anywhere unless there's an actual interactive web page on the other end of the link.
There is the occasional false positive but many good sites that end up on that list are there because their WordPress plugin got hacked and somewhere on their site they are actually hosting malware.
I've contacted the owners of hacked websites hosting phishing and malware content several times, and most of the time I've been accused of being the actual hacker or been told that I'm lying. These days I've given up trying to be the good guy and just report the websites to Google and Microsoft to protect the innocent.
Google's lack of transparency about which exact URLs are hosting bad material does play a role there.
On top of that, it is also recommended to serve user content from a separate domain for security reasons; it's much easier to avoid entire classes of exploits that way. For the site admins: treat it as a learning experience instead of lashing out at Google. In the long run you'll be better off, having learned a good lesson.
XKCD 1053 is not a valid excuse for what amounts to negligence in a production service.
So good, in fact, that it should have been known to an infrastructure provider in the first place. There's a lot of vitriol here that is ultimately misplaced, when it really belongs with the author's own ignorance.
My comment about vitriol was more directed at the HN commenters than Eric himself. Really, I think a discussion about web infrastructure is more interesting than a hatefest on Google. Thankfully, the balance seems to have shifted since I posted my top-level comment.
I suspect the author is unaware of their other blindspots. It's not 2001 anymore. Holding yourself out as a hosting provider comes with some baseline expectations.
Do you have more details? That sounds interesting.
The PSL is one of those load-bearing pieces of web infrastructure that is esoteric and thanklessly maintained. Maybe there ought to be a better way, both in the sense of a direct alternative (like DNS), and in the sense of a better security model.
For what it's worth, this makes it sound like you think the vitriol should be aimed at the author's ignorance rather than the circumstances which led to it, presuming you meant the latter.
However, I'm now reflecting on what I said as "be careful what you wish for", because the comments on this HN post have done a complete 180 since I wrote it, to the point of turning into a pile-on in the opposite direction.
Well, this is a problem that caused the author's ignorance but you present it as though it's the other way around. That's all I meant. Not disagreeing with "should have known better" either.
I checked it for two popular public suffixes that came to mind: 'livejournal.com' and 'substack.com'. Neither was there.
Maybe I'm mistaken and these suffixes shouldn't be included (so it's not a bug), but I can't think of a reason why.
User-uploaded content (which does pose a risk) is all hosted on substackcdn.com.
The PSL is more for "anyone can host anything in a subdomain of any domain on this list" rather than "this domain contains user-generated content". If you're allowing people to host raw HTML and JS then the PSL is the right place to go, but if you're just offering a user post/comment section feature, you're probably better off getting an early alert if someone has managed to breach your security and hacked your system into hosting phishing.
It's a weird thing, to be honest: a GitHub repo, mentioned nowhere in any standard, that browsers use to decide which subdomains to treat differently.
Information like this doesn't just manifest itself in your brain once you start hosting stuff, and if I hadn't known about its existence I wouldn't have thought to look for a project like this either. I certainly wouldn't have expected it to be both open to everyone and built into every modern internet-capable computer or anti-malware service.
> All sites on statichost.eu get a SITE-NAME.statichost.eu domain, and during the weekend there was an influx of phishing sites.
Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged. How else is Google supposed to know that subdomains belong to different users? That's what the PSL is for.
From my reading, Safe Browsing did its job correctly in this case, and they restored the site quickly once the threat was removed.
The new separate domain is pending inclusion in the PSL, yes.
Edit: the "effort" I'm talking about above refers to more real-time moderation of content.
> Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged.
NO, Google should be "mindful" (I know companies are not people but w/e) of the power it unfortunately has. Also, Cloudflare. All my homies hate Cloudflare.
... by using the agreed-upon tool for tracking domains that treat themselves as TLDs for third-party content: the public suffix list. Microsoft Edge and Firefox also use the PSL, and their mechanisms for protecting users would be similarly suspicious that attacks originating from statichost.eu came from the owners of that domain and not from some third party that happened to independently control foo.statichost.eu.
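To make that concrete, here's a minimal sketch (my own illustration, not anything from the article) using the Python `tldextract` library, which bundles a snapshot of the PSL; the domain names are just examples. Without a PSL entry, every tenant subdomain collapses into the same registrable domain, which is exactly the unit that reputation systems like Safe Browsing judge:

```python
# Minimal sketch using the third-party tldextract library (bundles a PSL snapshot).
# Domain names are illustrative.
import tldextract

# include_psl_private_domains=True also honors the "private" PSL section,
# which is where entries like github.io (and, hypothetically, statichost.eu) live.
extract = tldextract.TLDExtract(include_psl_private_domains=True)

for host in ["alice.statichost.eu", "phisher.statichost.eu", "phisher.github.io"]:
    parts = extract(host)
    print(host, "->", parts.registered_domain)

# alice.statichost.eu   -> statichost.eu       (no PSL entry: all tenants share one domain)
# phisher.statichost.eu -> statichost.eu
# phisher.github.io     -> phisher.github.io   (github.io is on the PSL, so tenants are isolated)
```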
That is probably true, but in this case I think most people would think that they used that power for good.
It was inconvenient for you and the legitimate parts of what was hosted on your domain, but it was blocking genuine phishing content that was also hosted on your domain.
In any social system, there is always someone with most of the power (distributed power is an unstable equilibrium), and it's incumbent upon us, the web developers, to know the current status quo.
Back in the day, if you weren't testing on IE6 you weren't serving a critical mass of your potential users. Nowadays, the nameplates have changed but the same principles hold.
This safety feature saves a nontrivial number of people from life-changing mistakes. Yes, we publishers have to take extra care. It's hard to see a negative here.
Helping people avoid potentially devastating mistakes is of course a good thing.
There are required steps to follow, but none of them are "have X users" or "see a lot of spam". It's mostly "follow the proper DNS steps and submit in the given format", with a little "show you're doing this for the intended reason, rather than to circumvent something the PSL isn't meant for or for something the public can't reach anyway" (e.g. tricking rate limits, internal-only or single-user personal sites) added on top.
"Projects that are smaller in scale or are temporary or seasonal in nature will likely be declined. Examples of this might be private-use, sandbox, test, lab, beta, or other exploratory nature changes or requests. It should be expected that despite whatever site or service referred a requestor to seek addition of their domain(s) to the list, projects not serving more then thousands of users are quite likely to be declined."
Maybe the rules have changed, or maybe you were lucky? :)
I think most people working in tech know the extent to which Google can screw over a business when they make a mistake, but the gravity of the situation becomes much clearer when it actually happens to you.
This time it's a phishing website, but what if the same happens five years down the line because of an unflattering page about a megalomaniac US politician?
The public suffix list (https://publicsuffix.org/) is good, and if I were to start from scratch I would do it that way (with a different root domain), but it's not absolutely required: the search engines can and do make exceptions that don't rely exclusively on the PSL. You'll hit a few bumps in the road before that gets established, though.
Ultimately Google needs to have a search engine that isn't full of crap, so moving user content to a root domain on the PSL that is infested with phishing attacks isn't going to save you. You need to do prolific and active moderation to root out this activity or you'll be right back on their shit list. Google could certainly improve this process by providing better tooling (a Safe Browsing report/response API would be extremely helpful), but ultimately the burden is on platforms to weed out malicious activity and prevent it from happening, and it's a 24/7 job.
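As far as I know there's still no report/response API, but the lookup side does exist. As a gesture toward that "better tooling", here's a rough sketch (endpoint and payload shape are from my memory of the public Safe Browsing v4 Lookup API docs; the API key, client id, and URLs are placeholders) of a host proactively sweeping its own tenant URLs:

```python
# Rough sketch: proactively check hosted URLs against the Safe Browsing v4 Lookup API.
# Endpoint/payload per the public v4 docs as I recall them; key and URLs are placeholders.
import requests

API_KEY = "YOUR_GOOGLE_API_KEY"  # placeholder
ENDPOINT = f"https://safebrowsing.googleapis.com/v4/threatMatches:find?key={API_KEY}"

def check_urls(urls):
    payload = {
        "client": {"clientId": "example-static-host", "clientVersion": "1.0"},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": u} for u in urls],
        },
    }
    resp = requests.post(ENDPOINT, json=payload, timeout=10)
    resp.raise_for_status()
    # An empty response body means no matches; otherwise "matches" lists the flagged URLs.
    return resp.json().get("matches", [])

# e.g. sweep every tenant site nightly and suspend anything that comes back flagged:
# flagged = check_urls(["https://some-tenant.statichost.eu/login.html"])
```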
BTW the PSL is a great example of the XKCD "one critical person doing thankless unpaid work" comic, unless that has changed in recent years. I am a strong advocate of moving PSL management to an annual-fee-driven structure (https://groups.google.com/g/publicsuffix-discuss/c/xJZHBlyqq...); the maintainer deserves compensation for his work, and requiring the fee would let the many abandoned domains on the list drop off of it.
Strict cookies crossing root to subdomains would be a major security bug in browsers. It's always been a (valid) theoretical concern but it's never happened on a large scale to the point I've had to address it. There is likely regression testing on all the major browsers that will catch a situation where this happens.
Anyone who can upload HTML pages to subdomain.domain.com can read and write cookies for *.domain.com, unless you declare yourself a public suffix and enough time has passed for all the major browsers to have updated themselves.
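Here's a tiny sketch of what that means in practice (my own illustration, using Python's `requests` cookie jar as a stand-in for the browser; the cookie name and domains are made up). A page on any tenant subdomain can set a cookie scoped to the whole parent domain, and the jar will happily hold it for every sibling subdomain, including the operator's own control panel:

```python
# Sketch: a cookie set with Domain=.statichost.eu by one tenant page is scoped
# to the entire parent domain. Cookie name and domains are made up.
import requests

jar = requests.Session().cookies

# What a malicious page on evil-tenant.statichost.eu could do via JavaScript
# or a Set-Cookie header:
jar.set("admin_session", "attacker-chosen-value", domain=".statichost.eu")

for cookie in jar:
    print(cookie.name, "is valid for", cookie.domain)
# -> admin_session is valid for .statichost.eu  (i.e. every *.statichost.eu site)
```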
I've seen web hosts in the wild who could have their control panel sessions trivially stolen by any customer site. Reported the problem to two different companies. One responded fairly quickly, but the other one took several years to take any action. They eventually moved customers to a separate domain, so the control panel is now safe. But customers can still execute session fixation attacks against one another.
> Static site hosting you can trust
is more like amateur hour static site hosting you can’t trust. Sorry.
I'm also trusting my users not to expose their cookies for the whole *.statichost.eu domain. And all "production" sites use a custom domain anyway, which avoids all of this.
Post author is throwing a lot of sand at Google for a process that has (a) been around for, what, over a decade now and (b) works. The fact of the matter is this hosting provider was too open, several of its users put up content intended to attack people, and as far as Google (or anyone else on the web) is concerned, the registrable domain is where the buck stops for that kind of behavior. This is one of the reasons why you host user-generated content off your primary domain; several providers have gotten the memo, and it's unfortunate statichost.eu had not yet.
I'm sorry this domain admin had to learn an industry lesson the hard way, but at least they won't forget it.
The bigger issue is that the internet needs governance. And, in the absence of regulation, someone has stepped in and done it in a way that the author didn't like.
Perhaps we could start by requiring that Google provide ways to contact a living, breathing human. (Not an AI bot that they claim is equivalent.)
Hopefully, this helps you understand why your living, breathing human is such a farcical idea for the Googs to consider.
So you can't take one part of the responsibility and abdicate the other part!
As a result, some ISPs apparently block the domain. Why is it listed? I have no idea. There are no ads, there is no user content, and I've never sent any email from the domain. I've tried contacting Spamhaus, but they instantly closed the ticket with a nonsensical response telling me to "contact my IT department" and then blocked further communication. (Oddly enough, my personal blog does not have an IT department.)
Just like it's slowly become quasi-impossible for an individual to host their own email, I fear the same may happen with independent websites.
Either that or your DNS provider hosts a lot of spam.
However, I think the issue is that with great power comes great responsibility.
They are better than most organisations, and working with many constraints that we cannot always imagine.
But several times a week we get a false "this mail is phishing" incident, where a mail from a customer or prospect is put in Spam with a red security banner saying it contains "dangerous links". Generally it is caused by domain reputation issues that end up blocking all mail that passed through a particular e-mail scanning product. These products wrap URLs so they can scan them when the mail is read, and so when they fail to detect a virus, they become de facto purveyors of viruses, and their entire domain gets tagged as dangerous.
I raised this with Google in May (!) and have been exchanging mail on a nearly daily basis: pointing out a new security product that has been blacklisted, explaining the situation to a new agent, and so on.
Not only does this mean that they are training our staff that security warnings are generally false, but it means we are missing important mail from prospects and customers. Our customers are generally huge corporations, missing a mail for us is not like missing one mail for a B2C outfit.
So far the issue is not resolved (we are in October now!) and recently they have stopped responding. I appreciate our organisation is not the US Government, but still, we pay upwards of $20K/year for "Google Workspace Enterprise" accounts. I guess I was expecting something more.
If someone within Google reads this: you need to fix this.
Half (or more) of security alerts/warnings are false positives. Whether it's a vulnerability scanner complaining about non-existent issues (based on the version of Apache alone... which was backported by the package maintainer), or an AI report generated by interns at Deloitte fresh out of college, or someone reporting www.example.com to Google Safe Browsing as malicious: at least half of the things they report on are wrong.
You sort of have to have a clue (technically) and know what you are doing to weed through all the bullshit. Tools that block access based on these things do more harm than good.
Search Console always points to my internal login page, which isn’t public and definitely isn’t phishing.
They clear it quickly when I appeal, and since it’s just for me, I’ve mostly stopped worrying about it.
It basically goes: growing user base -> growing amount of malicious content -> ability to submit domain to PSL. In that order, more or less.
In terms of security, for me, there's no issue with being on the same domain as my users. My cookies are scoped to my own subdomain, and HTTPS only. For me, being blocked was the only problem, one that I can honestly admit was way bigger than I thought.
Hence, the PSA. :)
I have mixed opinions about Discord and, if I'm honest, I have mixed opinions about forums as well.
My preference is to take things like forums and move them over to something like XMPP, IRC, Signal, or (most preferred) Matrix.
There are bridges for Matrix <-> IRC as well if that interests you; there are bridges for everything. But I prefer Matrix with Cinny, and I generally think that, due to its decentralized nature, it might be better than centralized forums too.
Almost none, but it's due to a lot of complicated factors and not just the direct risk of user content.
Take moderation, even just keeping off the content that would get you banned by your ISP. It sucks. Nobody in their right mind would want to do it. There are countless bots and trolls that will flood your forums for whatever cause they champion.
Then there are DDoS floods because you pissed off said bots and trolls. These can make the forums unaffordable and piss off your ISP.
But even if nothing goes wrong, popularity is a risk in itself. In the past there was stuff like the Slashdot effect, where your site would go down for a while. But now, if your small site becomes popular on TikTok for some reason, 20 million people could show up. Even if your site can stand up to that, how will you moderate it? How will you pay for the bandwidth?
Oh, and will you get any advertisers because of said user content? How are you going to pay for the site?
Oh, and you're also competing with massive sites for eyeballs; how are you going to get actual users?