It turns out OpenRouter’s API is protected by Cloudflare, and something about specific raw chunks of HTML and JavaScript in the POST request body causes it to block many, though not all, requests. Going direct to OpenAI or Anthropic with the same prompts is fine. I wouldn’t mind, but these are billable requests to commercial models and not OpenRouter’s free models (which I expect to be heavily protected from abuse).
(Update: On the way to doing that, I decided to run my tests again and they now work without Cloudflare being touchy, so I'll keep an eye on it!)
(Update 2: They just replied to me on X and said they had fixed their Cloudflare config - happy days!)
So is HN and every other site in the world insecure because it allows users to post "/etc/hosts" ?
The part that’s not said out loud is that a lot of “computer security” people aren’t concerned with understanding the system. If they were, they’d be engineers. They’re trying to secure it without understanding it.
https://knowyourmeme.com/memes/events/star-wars-battlefront-...
I believe the exact opposite.
One (of many) reasons is that it can make your code less secure, by hiding your security mistakes from you.
If your WAF obscures escaping issues during your own testing and usage you could very easily let those escaping issues go unresolved - leaving you vulnerable to any creative attacker who can outsmart your WAF.
Why? It obviously has an annoying cost and equally obviously won't stop any hacker with a lukewarm IQ
The problem is that generally you're breaking actual valid use cases as the tradeoff for adding another layer of defense against hypothetical vulnerabilities.
Yes, discussing the hosts file is a valid use case.
Yes, putting angle brackets in the title of your message is a valid use case your users are going to want.
Yes putting "mismatched" single quotes inside double quotes is a thing users will do.
Yes your users are going to use backslashes and omit spaces in a way that looks like attempts at escaping characters.
(All real problems I've seen caused by overzealous security products)
It is net zero for security and net negative for user experience, so having it is worse than not having it.
The way I assume it works in practice on a real team is that after some time, most of your team will have no idea how the WAF works and what it protects against, where and how it is configured… but they know it exists, so they will no longer pay attention to security because “we have a tool for that”, especially when they should have finished that feature a week ago…
If you push back you'll always get a lecture on "defense in depth", and then they really look at you like you're crazy when you suggest that it's more effective to get up, tap your desk once, and spin around in a circle three times every Thursday morning. I don't know... I do this every Thursday and I've never been hacked. Defense in depth, right? It can't hurt...
the main point is you need to pay a third party
“We need SQL injection rules in the WAF”
“But we don’t have an SQL database”
“But we need to protect against the possibility of partnering with another company that needs to use the same datasets and wants to import them into a SQL database”
In fairness, these people are just trying to do their job too. They get told by NIST (et al) and cloud service providers that a WAF is best practice. So it’s no wonder they’d trust these snake oil salesmen over the developers who are asking not to do something “security” related.
It's like not allowing the filesystem to use the word "virus" in a file name. Yes, it technically protects against some viruses, but it's really not very difficult to avoid while being a significant problem to a fair number of users with a legitimate use case.
It's not that it's useless. It's that it's stupid.
It reminds me of when airports started scanning people's shoes because an attacker had used a shoe bomb. Yes, that'll stop an attacker trying a shoe bomb again, but it disadvantages every traveller and attackers know to put explosives elsewhere.
It’s even dumber than that. An attacker tried and failed to use a shoe bomb, and yet his failure has caused untold hours of useless delay for over 13 years now.
Anything else is just a fuzzy bug injector that will only stop the simplest scanners and script kiddies if you are lucky.
Resolving wildcards is trickier but definitely possible if you have a list of forbidden files
[1]: https://nodejs.org/api/path.html#pathresolvepaths
Edit: changed link because C's realpath has a slightly different behavior
Be very, very careful about this, because if you aren't, this can actually result in platform-dependent behavior or actual filesystem access. They are bytes containing funny slashes and dots, so process them as such.
Edit: s/text/bytes/
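A minimal sketch of doing this lexically, without touching the filesystem (the blocklist and function name here are illustrative, not anyone's actual implementation):

```python
import posixpath

# Illustrative blocklist; a real system would derive this from policy.
FORBIDDEN = {"/etc/hosts", "/etc/passwd", "/etc/ssh/sshd_config"}

def is_forbidden(path: str) -> bool:
    # posixpath.normpath collapses "." and ".." purely lexically, so it
    # never touches the filesystem and behaves the same on every platform.
    # (os.path.realpath would resolve symlinks via real filesystem access.)
    return posixpath.normpath(path) in FORBIDDEN

print(is_forbidden("/etc/./hosts"))         # True
print(is_forbidden("/var/../etc/passwd"))   # True
print(is_forbidden("/etc/h0sts"))           # False
```

Note the tradeoff: a purely lexical check can't see symlinks, which is exactly why it's safe to run on untrusted strings but insufficient as the only defense on a real filesystem.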
The outcome is the usual one, stuff breaks and there is no additional security.
Another developer in the team decided they wanted to log what customers searched for, so if someone typed in "OutOfMemoryException" in the search bar...
Error: OutOfMemoryException
And Search: OutOfMemoryException
Should not be related in any way. I guess demanding "structured logs for everything or bust" is the answer? (I'm not a big o11y guy, so pardon me if this is obvious.)
It all comes down to understanding whether the intersection of two grammars is empty.
Sure, you could add a convention to your 'how to log' doc that specifies that all user input should be tagged with a double '#', but who reads docs until things break? Convention is a shitty way to make things work.
There's 100 ways that you could make this work correctly. Only restarting on a much more specific string, i.e. including the app name in the log line, etc... but that's all just reducing the likelihood that you get burned.
I've also written an OOM-Killer.sh myself, I'm not above that, but it's one of those edge cases that's impossible to do correctly, which is why parsing and acting on log data is generally considered an anti-pattern.
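For what it's worth, the structured-logging idea raised upthread can be sketched in a few lines; the event names and field names below are made up for illustration:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    # One JSON object per line: a log watcher matches on the "event" key,
    # so user-supplied text in a data field can never look like an error.
    def format(self, record):
        entry = {"level": record.levelname, "event": record.getMessage()}
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)

# A user searching for "OutOfMemoryException" rides in a data field,
# not in the message a watcher would grep for...
log.info("search_performed", extra={"fields": {"query": "OutOfMemoryException"}})
# ...while a real error uses a distinct event name the watcher keys on.
log.error("out_of_memory")
```

A watcher that restarts on `"event": "out_of_memory"` then can't be tripped by search queries, however hostile.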
Numeronyms are evil and we should stop using them.
counterpoint - people are going to use them; better to expose newbies early and often, and then everyone is better off
shorthands will always be in demand. we used to say “horseless carriage”, then “automobile”, then “car”. would you rather use Light amplification by stimulated emission of radiation or just “laser”s? etc
in the new york times? sure, spell out observability. but on HN? come on. the term is 7 years old and is used all over the site. it’s earned it
It's not likely to be a WAF or content scanner, because the HTTP request is using PUT (which browser forms don't use) and it's uploading the content as a JSON content-type in a JSON document. The WAF would have to specifically look for PUTs, open up the JSON document, parse it, find the sub-string in a valid string, and reject it. OR it would have to filter raw characters regardless of the HTTP operation.
Neither of those seem likely. WAFs are designed to filter on specific kinds of requests, content, and methods. A valid string in a valid JSON document uploaded by JavaScript using a JSON content-type is not an attack vector. And this problem is definitely not path traversal protection, because that is only triggered when the string is in the URL, not some random part of the content body.
Nobody could prove that's exactly what's happening without seeing Cloudflare's internal WAF rules, but can you think of any other reasonable explanation? The endpoint is rejecting a PUT whose payload contains exactly /etc/hosts, /etc/passwd, or /etc/ssh/sshd_config, but NOT /etc/password, /etc/ssh, or /etc/h0sts. What else could it be?
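You can at least demonstrate the suspected pattern locally with a toy mock. This is not Cloudflare's actual rule, just an illustration of a body-scanning filter that ignores method and content-type entirely:

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A toy stand-in for a WAF that scans raw request bodies for "suspicious"
# substrings, no matter the method or content-type. Everything here is a
# local mock reproducing the observed blocking pattern.
class MockWaf(BaseHTTPRequestHandler):
    BLOCKLIST = (b"/etc/hosts", b"/etc/passwd", b"/etc/ssh/sshd_config")

    def do_PUT(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        blocked = any(needle in body for needle in self.BLOCKLIST)
        self.send_response(403 if blocked else 200)
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

def put_json(url, payload):
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

server = HTTPServer(("127.0.0.1", 0), MockWaf)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

print(put_json(url, {"body": "how to edit /etc/hosts"}))   # 403
print(put_json(url, {"body": "how to edit /etc/h0sts"}))   # 200
server.shutdown()
```

The point of the demo: the 403 depends only on raw bytes in the body, which matches the reported symptoms (valid JSON, JSON content-type, PUT, and still blocked).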
It references this CVE https://github.com/tuo4n8/CVE-2023-22047 which allows the reading of system files. The example given shows them reading /etc/passwd
The author just collected a bunch of correlations and then decided what the cause was. I've been doing this kind of work for many, many years. Just because it looks like it's caused by one thing, doesn't mean it is.
Correlation is not causation. That's not just a pithy quip, there's a reason why it's important to actually find causation.
edit: Also, someone commented here "it was an irrelevant cf WAF rule, we disabled it". Assuming honesty, seems to confirm that the author was indeed right.
The same application also stored my full password in localStorage and a cookie (without httponly or secure). Because reasons. Sigh.
I'm going to do a hot take and say that WAFs are bollocks mainly used by garbage software. I'm not saying a good developer can't make a mistake and write a path traversal, but if you're really worried about that then there are better ways to prevent that than this approach which obviously is going to negatively impact users in weird and mysterious ways. It's like the naïve /(fuck|shit|...)/g-type "bad word filter". It shows a fundamental lack of care and/or competency.
Aside: is anyone still storing passwords in /etc/passwd? Storing the password in a different root-only file (/etc/shadow, /etc/master.passwd, etc.) has been a thing on every major system since the 90s AFAIK?
So it's really blocking doorknob-twisting scripts.
as far as WAFs being garbage, they absolutely are, but this is a great time for a POSIWID analysis. A WAF says its purpose is to secure web apps. It doesn't do that, but people keep buying them. Now we're faced with a crossroads: we either have to assume that everyone is stupid or that the actual purpose of a WAF is something other than its stated purpose.

I personally only assume stupidity as a last resort. I find it lazy and cynical, and it's often used to dismiss things as hopeless when they're not actually hopeless. To just say "Oh well, people are dumb" is a thought-terminating cliche that ignores potential opportunities. So we do the other thing and actually take some time to think about who decides to put a WAF in place and what value it adds for them.

Once you do that, you see myriad benefits, because a WAF is a cheap, quick solution that allows non-technical people to say they're doing something. You're the manager of a finance OU that has a development group in it whose responsibility is some small web app. Your boss just read an article about cyber security and wants to know what this group two levels below you is doing about it. Would you rather come back with "We're gonna need a year, $1 million, and every other dev priority to be pushed back in order to develop a custom solution" or "We can have one fired up tomorrow for $300/mo; it's developed and supported by Microsoft and it's basically industry standard"?

The negative impact of these things is obvious to us because this is what we do, but we're not always the decision-makers for stuff like that. Often the decision-makers are actually that naive, and/or they're motivated less by the ostensible goal of better web app security and more by the goal of better job security.
As far as /etc/passwd goes, you're right that passwords don't live there anymore, but user IDs often do, and those can indicate which services are running as daemons on a given system. This is vital, because if you can figure out what services are running you can start version-fingerprinting them and then cross-referencing those versions with the CVE database.
Writing about biology, finance, or geology? Shrug.
Dumb filtering is bad enough when used by smart people with good intent.
For bonus, the reverse proxy will run on a system infiltrated by Russian (why not Chinese as well) hackers.
sits quietly for a second
"Oh nnnnnnnooooooooooooooo lol!"
Writing `find` as the first word in your search will prevent Firefox from accepting the “return” key when it is pressed.
Pretty annoying.
If your site discusses databases then turning on the default SQL injection attack prevention rules will break your site. And there is another ruleset for file inclusion where things like /etc/hosts and /etc/passwd get blocked.
I disagree with other posts here; it is partially a balance between security and usability. You never know what service was implemented with possible security exploits, and being able to throw every WAF rule on top of your service does keep it more secure. It's just that those same rulesets are super annoying when you have a securely implemented service which needs to discuss technical concepts.
Fine tuning the rules is time consuming. You often have to just completely turn off the ruleset because when you try to keep the ruleset on and allow the use-case there are a ton of changes you need to get implemented (if its even possible). Page won't load because /etc/hosts was in a query param? Okay, now that you've fixed that, all the XHR included resources won't load because /etc/hosts is included in the referrer. Now that that's fixed things still won't work because some random JS analytics lib put the URL visited in a cookie, etc, etc... There is a temptation to just turn the rules off.
I favor the latter approach. That group of Cloudflare users will understand the complexity of their use case accepting SQL in payloads and will be well-positioned to modify the default rules. They will know exactly where they want to allow SQL usage.
From Cloudflare’s perspective, it is virtually impossible to reliably cover every conceivable valid use of SQL, and it is likely 99% of websites won’t host SQL content.
WAFs do throw false positives and do require adjustments OOTB for most sites, but you’re missing the forest by focusing on this single case.
And WAF rules can be tuned. There's no reason an apostrophe in a username or similar needs to be blocked, if it were by a rule.
If you know what you're doing, turn these protections off. If you don't, there's one less hole out there.
Ask the CIO what actual threat all this is preventing, and you'll get blank stares.
As an engineer what incentive is there to put effort into knowing where each form input goes and how to sanitize it in a way that makes sense? You are getting paid to check the box and move on, and every new hire quickly realizes that. Organizations like these aren't focused on improving security, they are focused on covering their ass after the breach happens.
the CIO is securing his job.
Every CIO I have worked for (where n=3) has gotten where they are because they're a good manager, even though they have near-zero current technical knowledge.
The fetishizing of "business," in part through MBAs, has been detrimental to actually getting things done.
A century ago, if someone asked you what you do and you replied, "I'm a businessman. I have a degree in business," you'd get a response somewhere between "Yeah, but what do you actually do?" and outright laughter.
Finance and business grads have really taken over the economy, not just through technocratic "here's how to do stuff" advice but by personally taking all the reins of power. They're even hard at work taking over medicine and pushing doctors out of the work-social upper-middle-class. Already did it with professors. Lawyers seem safe, so far.
Nope, lawyers are fucked too. It's just not as advanced yet: https://www.abajournal.com/web/article/arizona-approves-alte...
And economics. Many people here are blaming incompetent security teams and app developers, but a lot of seemingly dumb security policies are due to insurers. If an insurer says "we're going to jack up premiums by 20% unless you force employees to change their password once every 90 days", you can argue till you're blue in the face that it's bad practice, NIST changed its policy to recommend not regularly rotating passwords over a decade ago, etc., and be totally correct... but they're still going to jack up premiums if you don't do it. So you dejectedly sigh, implement a password expiration policy, and listen to grumbling employees who call you incompetent.
It's been a while since I've been through a process like this, but given how infamous log4shell became, it wouldn't surprise me if insurers are now also making it mandatory that common "hacking strings" like /etc/hosts, /etc/passwd, jndi:, and friends must be rejected by servers.
I wouldn't be mean about it. I'm imagining adding a line to the email such as:
> (Yes, I know this is annoying, but it's required by our insurance company.)
What is the insurance company going to do, jack up our rates because we accurately stated what their policy was?
It would've gone from the insurer to the legal team, to the GRC team, to the enterprise security team, to the IT engineering team, to the IT support team, and then to the user.
Steps #1 to #4 can (and do) introduce their own requirements, or interpret other requirements in novel ways, and you'd be #5 in the chain.
We're SOC2 + HIPAA compliant, which either means convincing the auditor that our in-house security rules cover 100% of the cases they care about... or we buy an off-the-shelf WAF that has already completed the compliance process, and call it a day. The CTO is going to pick the second option every time.
If your startup is on the verge of getting a 6 figure MRR deal with a company, but the company's security team mandates you put in a WAF to "protect their data"... guess you're putting in a WAF, like it or not.
Install the WAF crap, and then feed every request through rot13(). Everyone is happy!
If anything, I think this attitude is part of the problem. Management, IT security, insurers, governing bodies, they all just impose rules with (sometimes, too often) zero regard for consequences to anyone else. If no pushback mechanism exists against insurer requirements, something is broken.
If the insurer requested something unreasonable, you'd go to a different insurer. It's a competitive market after all. But most of the complaints about incompetent security practices boil down to minor nuisances in the grand scheme of things. Forced password changes once every 90 days is dumb and slightly annoying but doesn't significantly impact business operations. Having to run some "enterprise security tool" and go through every false positive result (of which there will be many) and provide an explanation as to why it's a false positive is incredibly annoying and doesn't help your security, but it's also something you could have a $50k/year security intern do. Turning on a WAF that happens to reject the 0.0001% of Substack articles which talk about /etc/hosts isn't going to materially change Substack's revenue this year.
The underlying purpose of the rules, and the agency to apply the spirit rather than the letter, gets lost early in the chain, and trying to unwind it can be tedious.
Information loss is an inherent property of large organizations.
I keep hearing that often on HN; however, I've personally never seen such demands from insurers. I would greatly appreciate it if someone could share such an insurance policy. Insurance policies are not trade secrets and are OK to be public. I can google plenty of commercial car insurance policies, for example.
https://retail.direct.zurich.ch/resources/definition/product...
Questionnaire Zurich Cyber Insurance
Question 4.2: "Do you have a technically enforced password policy that ensures use of strong passwords and that passwords are changed at least quarterly?"
Since this is an insurance questionnaire, presumably your answers to that question affect the rates you get charged?
(Found that with the help of o4-mini https://chatgpt.com/share/680bc054-77d8-8006-88a1-a6928ab99a...)
Totally bonkers stuff.
Eliminating everything but a business's industry specific apps, MS Office, and some well-known productivity tools slashes support calls (no customization!) and frustrates cyberattacks to some degree when you can't deploy custom executables.
It's about the transition from artisanal hand-configuration to mass-produced fleet standards, and diverting exceptional behavior and customizations somewhere else.
Alice is on Discord because half of the products the company uses now give more or less direct access to their devs through Discord
At first glance that might seem a poor move for corporate information security. But crucially, the security of cloud webapps is not the windows sysadmins' problem - buck successfully passed.
In around 2011, the Defence Signals Directorate (now the Australian Signals Directorate) went through and did an analysis of all of the intrusions they had assisted with over the previous few years. It turned out that app whitelisting, patching OS vulns, patching client applications (Office, Adobe Reader, browsers), and some basic permission management would have prevented something like 90% of them.
The "Top 4" was later expanded to the Essential Eight which includes additional elements such as backups, MFA, disabling Office macros and using hardened application configs.
https://www.cyber.gov.au/resources-business-and-government/e...
You install software via ticket requests to IT, and devs might have admin rights, but not root, and only temporary.
This is nothing new though, back in the timesharing days, where we would connect to the development server, we only got as much rights as required for the ongoing development workflows.
Hence why PCs felt so liberating.
Just wait until more countries adopt cybersecurity laws making companies liable when software doesn't behave, like in any other engineering industry.
Fear of a prospective expectation, compliance, requirement, etc., even when that requirement does not actually exist is so prevalent in the personality types of software developers.
My mental model at this point says that if there's a cost to some important improvement, the politics and incentives today are such that a typical executive will only do the bare minimum required by law or some equivalent force, and not a dollar more.
The worst part about cyber insurance, though, is that as soon as you declare an incident, your computers and cloud accounts now belong to the insurance company until they have their chosen people rummage through everything. Your restoration process is now going to run on their schedule. In other words, the reason the recovery from a crypto-locker attack takes three weeks is because of cyber insurance. And to be fair, they should only have to pay out once for a single incident, so their designated experts get to be careful and meticulous.
The long passphrase is more for the key that unlocks your password manager rather than the random passwords you use day to day.
This is such a bizarre hybrid policy, especially since forced password rotations at fixed intervals are already not recommended for end-user passwords as a security practice.
I believe that this is overall a reasonable approach for companies that are bigger than "the CEO knows everyone and trusted executives are also senior IT/Devs/tech experts" and smaller than "we can spin an internal security audit using in-house resources"
I would argue that password policies are very context dependent. As much as I detest changing my password every 90 days, I've worked in places where the culture encouraged password sharing. That sharing creates a whole slew of problems. On top of that, removing the requirement to change passwords every 90 days would encourage very few people to select secure passwords, mostly because they prefer convenience and do not understand the risks.
If you are dealing with an externally facing service where people are willing to choose secure passwords and unwilling to share them, I would agree that regularly changing passwords creates more problems than it solves.
When you don’t require them to change it, you can just assign them a random 16 character string and tell them it’s their job to memorize it.
Definitely, though I have seen other solutions, like inserting non-printable characters in the problematic strings (e.g. "/etc/ho<b></b>sts" or whatever, you get the idea). And honestly that seems like a reasonable, if somewhat annoying, workaround to me that still retains the protections.
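A rough sketch of that workaround, using a zero-width space rather than HTML markup (whether a given WAF normalizes such characters away is an open question, and the list of strings to defang is made up):

```python
ZWSP = "\u200b"  # zero-width space: invisible when rendered

def defang(text, needles=("/etc/hosts", "/etc/passwd")):
    # Break up each sensitive string so it no longer matches a naive
    # substring rule, while the rendered article looks unchanged.
    for needle in needles:
        text = text.replace(needle, needle[:4] + ZWSP + needle[4:])
    return text

safe = defang("add an entry to /etc/hosts")
print("/etc/hosts" in safe)    # False
print(safe.replace(ZWSP, ""))  # strips back to the original text
```

The obvious downside for technical writing: readers who copy-paste the path get an invisible character along with it.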
WAFs are always a bad idea (possible exception: in allow-but-audit mode). If you knew the vulnerabilities you'd protect against them in your application. If you don't know the vulnerabilities all you get is a fuzzy feeling that Someone Else is Taking Care of it, meanwhile the vulnerabilities are still there.
Maybe that's what companies pay for? The feeling?
We changed auditors after that.
I might be out of the loop here, but it seems to me that any WAF that's triggered when the string "/etc/hosts" is literally anywhere in the content of a requested resource, is pretty obviously broken.
A false positive from a conservative evaluation of a query parameter or header value is one thing, conceivably understandable. A false positive due to the content of a blog post is something else altogether.
Rules like this might very well have had an incredibly positive impact on tens of thousands of websites, at the cost of some weird debugging sessions for dozens of programmers (made-up numbers, obviously).
Oh: I resisted tooth and nail about turning on a WAF at one of my gigs (there was no strict requirement for it, just cargo cult). Turns out - I was right.
Trying to contact support was difficult too due to AI chatbots, but when I finally did reach a human, their "tech support" obviously didn't bother to look at this in any reasonable timeframe.
It wasn't until some random person on Twitter suggested the possibility of some magic string tripping over some stupid security logic that I found the problem and could finally edit my post.
[1] the second time it happened, a colleague added "if we got 403, print "HAHAHA YOU'VE BEEN WAFFED" to our deployment script, and for that I am forever thankful because I saw that error more times than I expected
Oh, I just remembered I had another encounter with the AWS WAF.
I had a Jenkins instance in our cloud account that I was trying to integrate with VSTS (imagine GitHub, except developed by Microsoft and still maintained, never mind that they own GitHub and it's undoubtedly a better product). Whenever I triggered a build it worked, but when VSTS did, it failed. Using a REST monitor service I was able to record the exact requests VSTS was making and prove that they worked with curl from my machine... After a few nights of experimenting and diffing, I noticed a difference between the request VSTS made to the REST monitor and my reproduction with curl: VSTS didn't send a "User-Agent" header, while curl supplied one by default (unless I suppressed it with, I think, -H "User-Agent:"), and therefore my curl requests never tripped the first default rule in the AWS WAF: "if your request doesn't list a user agent, you're a hacker".
HAHAHA I'VE BEEN WAFFED AGAIN.
How about this: don't run a dumb as rocks Web Application Firewall on an endpoint where people are editing articles that could be about any topic, including discussing the kind of strings that might trigger a dumb as rocks WAF.
This is like when forums about web development implement XSS filters that prevent their members from talking about XSS!
Learn to escape content properly instead.
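For example, escaping at the output boundary neutralizes markup without needing a blocklist of scary-looking strings; a minimal sketch using Python's stdlib:

```python
from html import escape

title = '<script>alert("/etc/hosts")</script>'
# Context-aware output escaping renders the text harmlessly; the string
# "/etc/hosts" passes through untouched because it was never dangerous.
print(escape(title, quote=True))
# &lt;script&gt;alert(&quot;/etc/hosts&quot;)&lt;/script&gt;
```

The same principle applies per output context: HTML escaping for HTML, parameterized queries for SQL, and so on.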
Ahh, the modern trend of ”unalived”¹ etc. comes to every corner of society eventually.
If that's their idea of security...
SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@pmFromFile lfi-os-files.data"
SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@pmFromFile unix-shell.data" ...
where the referenced files contain the usual list of *nix suspects, including the offending filename (lfi-os-files.data, "local file inclusion" attacks). The advantage (whack-a-mole notwithstanding) of a WAF is that it's orders of magnitude easier to tweak WAF rules than to upgrade, say, WebLogic, or other teetering piles of middleware.
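When a rule like that false-positives on one known-safe field, CRS-style tuning lets you narrow the target instead of disabling the whole ruleset. A hedged sketch in ModSecurity syntax; the rule ID, endpoint URI, and argument name are assumptions for illustration, not anyone's actual config:

```
# Exclude the article-body parameter from the LFI check on the draft
# endpoint only; everything else stays protected.
SecRule REQUEST_URI "@beginsWith /api/v1/drafts" \
    "id:1000001,phase:1,pass,nolog,\
    ctl:ruleRemoveTargetById=930120;ARGS:body"
```

That keeps the rule active for URLs, headers, and every other parameter, which is a much smaller hole than turning the ruleset off.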
Turns out the hunches were right all along.
But it doesn't. This case highlights a bug, a stupid bug. This case highlights that people who should know better, don't!
The tension between security and usability is real, but this is not it. Tension between security and usability is usually a tradeoff: when you implement good security, it inconveniences the user. From simple things like 2FA, to locking out the user after 3 failed attempts, to rate limiting to prevent DoS. It's a tradeoff. You increase security and degrade user experience, or you decrease security and improve user experience.
This is neither. This is both bad security and bad user experience. What's the tension?
One of the authors of the paper has said "WAFs are just speed bump to a determined attacker."
Doors are a speedbump for a car.
Well yeah, sure, doesn't mean I'm going to have an open doorframe or a door without a lock.
This isn't like having a lock on your door, this is like having a cheap, easily pickable padlock on your bank vault. If the vault has a proper lock then the padlock serves no purpose, and if it doesn't then you're screwed regardless.
WAFs can have thousands of rules ranging from basic to the sophisticated, not unlike mechanisms you can deploy at a checkpoint.
Security devices like IDSes or WAFs allow deploying filtering logic without touching an app directly, which can be hard/slow across team boundaries. They can allow retroactive analysis and flagging to a central log analysis team. Being able to investigate whether an adversary came through your door after the fact is powerful, you might even be able to detect a breach if you can filter through enough alerts.
People are more likely to get dismissed for not installing an IDS or WAF than having one. Its effectiveness is orthogonal to the politics of its existence, most of the time.
This is like spam filtering. I'm an anti-spam advocate, so the idea that most people can't discuss spam because even the discussion will set off filters is quite old to me.
People who apologize for email content filtering usually say that spam would be out of control if they didn't have that in place, in spite of no personal experience on their end testing different kinds of filtering.
My email servers filter based on the sending server's configuration: does the EHLO / HELO string resolve in DNS? Does it resolve back to the connecting IP? Does the reverse DNS name resolve to the same IP? Does the delivery have proper SPF / DKIM? Et cetera.
My delivery-based filtering works worlds better than content-based filtering, plus I don't have to constantly update it. Each kind has advantages, but I'd rather occasional spam with no false positives than the chance I'm blocking email because someone used the wrong words.
With web sites and WAF, I think the same applies, and I can understand when people have a small site and don't know or don't have the resources to fix things at the actual content level, but the people running a site like Substack really should know better.
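The delivery-based checks described above reduce to simple decision logic once the DNS answers are in hand; a hedged sketch with made-up data (the actual DNS resolution and SPF/DKIM verification steps are omitted, and all names are illustrative):

```python
def evaluate_sender(peer_ip, helo_ips, ptr_names, ptr_ips, spf_ok, dkim_ok):
    # Configuration-based filtering over pre-resolved DNS answers: no
    # inspection of the message content at all.
    checks = {
        "helo_resolves": bool(helo_ips),
        "helo_matches_peer": peer_ip in helo_ips,
        # forward-confirmed reverse DNS: some PTR name resolves back to
        # the connecting IP
        "fcrdns": any(peer_ip in ptr_ips.get(n, ()) for n in ptr_names),
        "authenticated": spf_ok or dkim_ok,
    }
    return all(checks.values()), checks

ok, detail = evaluate_sender(
    peer_ip="203.0.113.5",
    helo_ips={"203.0.113.5"},
    ptr_names=["mail.example.org"],
    ptr_ips={"mail.example.org": {"203.0.113.5"}},
    spf_ok=True,
    dkim_ok=False,
)
print(ok)  # True
```

A misconfigured spam cannon fails several of these at once; a legitimate mail server discussing spam fails none of them, which is the whole point of filtering on delivery rather than content.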
1. Create a new post. 2. Include an image, set the filter to "All File Types" and select "/etc/hosts". 3. You get served with a weird error message box displaying a weird error message. 4. After this, the Substack post editor is broken. Heck, every time I access the Dashboard, it waits forever to build the page.
Did find this text while browsing the source for an error (see original ascii art: https://pastebin.com/iBDsuer7):
SUBSTACK WANTS YOU
TO BUILD A BETTER BUSINESS MODEL FOR WRITING
https://substack.com/jobs
This isn't a tension. This rule should not be applied at the WAF level. It doesn't know that this field is safe from $whatever injection attacks. But the substack backend does. Remove the rule from the WAF (and add it to the backend, where it belongs) and you are just as secure and much more usable. No tension.
People will manage to circumvent the firewall if they want to attack your site. But you will still pay, and get both the DoS vulnerabilities created by the firewall and the new attack vectors in the firewall itself.
https://www.macchaffee.com/blog/2023/wafs/
Of course, Wordpress is basically undefendable, so I'd never ever host it on a machine that has anything else of value (including e.g. db credentials that give access to much more than the public content on the WP installation).
> "How could Substack improve this situation for technical writers?"
They don’t care about (technical) writers. All they care about is building a TikTok clone to “drive discoverability” and make the attention-metrics go up. Chris Best is memeing about it on his own platform. Very gross.
Why would random text be parsed? I read the article but this doesn't make sense to me. They suggested directory traversal, but your text shouldn't have anything to do with that, and traversal is solved by permission settings.
I do understand this approach. From the defence point of view it makes sense: if you have to create a solution to protect millions of websites, it doesn't make sense to tailor it to the specifics of a single one.
(What's below is written not as "this is how it should be done" but instead "what I understand should be done". To provide context to what I do and do not understand so that my misunderstandings can be more directly addressed)
I understand being overzealous, an abundance of caution. But what I'm confused about is why normal text could lead to an exploit in the first place. I can only understand this being a problem if arbitrary text is being executed, which would be a HUGE security hole ripe for exploitation. But path access is handled by privileges. So even if arbitrary text is being executed, how can that lead to exploitation without there already being a major security hole?
Maybe I'm not understanding Substack? I've only been a reader. But why is the writer not in a container or chroot? If you want to be overly zealous, why not use two VMs? Put them in a VM to write; once they've edited, run tests and then use that as the image for the userspace in the ephemeral VM that is viewed by readers. Would this still be exploitable? I don't mean an image for the whole VM, I really do mean a subimage so you can lock that up.
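For what it's worth, the standard server-side answer to the traversal worry raised above is a containment check at the point of file access, not string matching on user text. A hypothetical helper (this is not Substack's code) might look like:

```python
from pathlib import Path

def resolve_upload(upload_root: str, user_path: str) -> Path:
    """Resolve a user-supplied name inside the upload root and refuse
    anything that escapes it, instead of pattern-matching strings
    like '/etc/hosts' in request bodies."""
    root = Path(upload_root).resolve()
    # Strip leading slashes so absolute paths are treated as relative names.
    candidate = (root / user_path.lstrip("/")).resolve()
    # After resolution, the candidate must still live under the root.
    if root not in candidate.parents and candidate != root:
        raise PermissionError(f"path escapes upload root: {user_path}")
    return candidate
```

With this in place, "/etc/hosts" appearing in a post body is irrelevant: only an actual file-access path that escapes the sandbox is rejected.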
WAFs blocking strings containing the filename, then, are the "to make sure nobody ever accidentally leaves your gate open, we've replaced it with a concrete wall" solution to this problem. You might never have this problem, and you might actually need to use the gate, but the vendor/security team has successfully solved their problem of checking off a blocked attack, and the consequences are now your problem.
I generally think that Substack has done a good thing for its core audience of longform newsletter creators who want to be Ben Thompson. However, its experience for technical people, for podcasters, for people who want to start multi-channel media brands, and for people who write for reach over revenue (but with optional revenue) has been really poor. (All four of these are us with Latent.Space.) I've aired all these complaints with them and they've done nothing, which is their prerogative.
I'd love for a "new Substack" to emerge, or a "Substack for developers".
He gave a talk on it at WordCamp Asia at the start of last year, although I haven’t heard of any progress recently on it.
1. It's a social media platform with a network that is still easy to extract organic growth from.
2. 99% email deliverability without configuring anything. It's whitelisted everywhere.
I worked on a project where we had to use a WAF for compliance reasons. It was a game of whack-a-mole to fix all the places where standard rules broke the application or blocked legitimate requests.
One notable and related example: any request containing the string "../" was blocked, because it might be a path traversal attack. Of course, it is more common that someone just put a relative path in their document.
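The false positive described here is easy to demonstrate. A sketch of that kind of rule (a plain substring/regex match on the request body, as many rule sets do):

```python
import re

# Block any payload containing "../" -- the shape of the rule described above.
waf_rule = re.compile(r"\.\./")

attack = "GET /download?file=../../etc/passwd"
legit_doc = "Set the stylesheet href to ../assets/style.css in your template."

blocked_attack = waf_rule.search(attack) is not None      # intended catch
blocked_doc = waf_rule.search(legit_doc) is not None      # ordinary prose, also blocked
```

Both strings trip the rule; the regex has no way to know that the second one is just a relative path in someone's document.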
* The product provided for blogging/content publishing did a shitty job of configuring WAF rules for its use cases (the utility of a "magic WAF that will just solve all your problems" being out of the picture for now).
* The WAF product provided by the cloud platform clearly has shitty, overreaching rules doing arbitrary filtering on arbitrary strings. That filtering absolutely can (and will) break unrelated content if the application behind the WAF is developed with a modicum of security-mindedness. You don't `fopen()` a string input (no, I will not be surprised - yes, sometimes you do `fopen()` a string input - when you are using software that is badly written).
So I am wondering:
1. Was this sent to Substack as a bug? They charge money for their platform, and the inability to store $arbitrary_string on a page you pay for, as a user, is actually a malfunction and a dysfunction. It might not be the case that "it got enshittified by a CIO who mandated a WAF of some description to tick a box"; it might be the case that "we grabbed a WAF from our cloud vendor and haven't reviewed the rules because we had no time". I don't think it would be very difficult for me, as an owner/manager at the blogging platform, to realise that enabling a rule filtering "anything that resembles a Unix system file path or a SQL query" is absolutely stupid for a blogging platform - and go and turn it the hell off at the first user complaint.
2. Similarly - does the cloud vendor know that their WAF refuses requests with such strings in them, and do they have a checkbox for "Kill requests which have any character an Average Joe does not type more frequently than once a week"? There should be a setting for that, and - thinking about the cloud vendor in question - I can't imagine the skill level there would be so low as to not have a config option to turn it off.
So - yes, that's a case of "we enabled a WAF for some compliance/external reasons/big customer who wants a 'my vendor uses a WAF' on their checklist", but also the case of "we enabled a WAF but it's either buggy or we haven't bothered to configure it properly".
To me it feels like this would be 2 emails first ("look, your thing <X> that I pay you money for clearly and blatantly does <shitty thing>, either let me turn it off or turn it off yourself or review it please") - and a blog post about it second.
paxys•10h ago
/etc/hosts
See, HN didn't complain. Does this mean I have hacked into the site? No, Substack (or Cloudflare, wherever the problem is) is run by people who have no idea how text input works.
eli•10h ago
This particular rule is obviously off. I suspect it wasn't intended to apply to the POST payload of user content. Perhaps just URL parameters.
On a big enough website, users are doing weird stuff all the time and it can be tricky to write rules that stop the traffic you don't want while allowing every oddball legitimate request.
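One way to express the narrower scoping suggested above (hypothetical rule, not any vendor's actual syntax): inspect only URL query parameters and leave POST bodies, which carry user content, alone:

```python
from urllib.parse import urlsplit, parse_qsl

# Strings the rule looks for -- only in query parameters, never in bodies.
SUSPICIOUS = ("/etc/hosts", "../")

def query_only_rule(url: str) -> bool:
    """Return True if any query parameter value contains a suspicious string.
    POST payloads are deliberately out of scope for this rule."""
    params = parse_qsl(urlsplit(url).query)
    return any(s in value for _, value in params for s in SUSPICIOUS)
```

A request like `?file=/etc/hosts` is flagged, while a blog post whose body mentions /etc/hosts never reaches this check at all.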
mystifyingpoi•10h ago
That's a very uncharitable view. It's far more likely that they are simply using some WAF with sane defaults and never caught this. They'll fix it and move on.
gav•10h ago
For example, I worked with a client that had a test suite of about 7000 or so strings that should return a 500 error, including /etc/hosts and other ones such as:
We "failed" and were not in compliance as you could make a request containing one of those strings--ignoring that neither Apache, SQL, or Windows were in use. We ended up deploying a WAF to block all these requests, even though it didn't improve security in any meaningful way.
krferriter•9h ago
> We "failed" and were not in compliance as you could make a request containing one of those strings--ignoring that neither Apache, SQL, or Windows were in use.
this causes me pain