AI Slop vs. OSS Security

https://devansh.bearblog.dev/ai-slop/
97•mooreds•2h ago

Comments

pksebben•2h ago
> The model has no concept of truth—only of plausibility.

This is such an important problem to solve, and it feels soluble. Perhaps a layer with heavily biased weights, trained on carefully curated definitional data, could do it. If we could train in a sense of truth - even a small one - many of the hallucinatory patterns would disappear.

Hats off to the curl maintainers. You are the xkcd jenga block at the base.

jcattle•2h ago
I am assuming that millions of dollars have already been spent trying to get LLMs to hallucinate less.

Even if problems feel soluble, they often aren't. You might have to invent an entirely new paradigm of text generation to solve the hallucination problem. Or it could be the Collatz Conjecture of LLMs: it "feels" so possible, but you never really get there.

big-and-small•1h ago
Nuclear fusion was always 30 years away (c)
quikoa•1h ago
It would be nice if nuclear fusion had the AI budget.
Cthulhu_•14m ago
Fusion will at best have a few dozen sales once it's commercially viable - and even that will take decades to realise - but you can sell AI stuff to millions of customers for $20/month each, today.
pjc50•1h ago
The "fact database" is the old AI solution, e.g. Cycorp; it doesn't quite work either. Knowing what is true is a really hard, unsolved problem in philosophy, see e.g. https://en.wikipedia.org/wiki/Gettier_problem . The secret to modern AI is just to skip that and replace unsolvable epistemology with "LGTM", then sell it to investors.
wongarsu•59m ago
Truth comes from being able to test your assertions. Without that, they remain in the realm of plausibility. You can't get from plausibility to truth with better training data; you need to give LLMs better tools to test the truth of their plausible statements before spewing them to the user (and to train the models to use those tools, obviously, but that's not the hard part).
wwfn•2h ago
Wealth generated on top of underpaid labor is a recurring theme -- and in this case it is perhaps surprisingly exacerbated by LLMs.

Would this be different if the underlying code had a viral license? If google's infrastructure was built on a GPL'ed libcurl [0], would they have investment in the code/a team with resources to evaluate security reports (slop or otherwise)? Ditto for libxml.

Does the GPL help the Linux kernel get investment from its corporate users?

[0] Perhaps an impossible hypothetical. Would google have skipped over the imaginary GPL'ed libcurl or libxml for a more permissively licensed library? And even if they didn't, would a big company's involvement in an openly developed ecosystem create asymmetric funding/goals, a la XMPP or Nix?

big-and-small•2h ago
Copyleft licenses are made to support freedom for everyone, particularly end users. They only limit the freedom of developers/maintainers to exploit the code and its users.

> Does the GPL help the Linux kernel get investment from its corporate users?

The GPL has helped "linux kernel the project" greatly, but companies invest in it out of their own self-interest. They want to benefit from upstream improvements, and playing nicely by upstreaming changes is just much cheaper than maintaining their own kernel fork.

On the other side you have companies like Sony that have used BSD OS code for their game consoles for decades and contributed nothing back.

So... Two unrelated things.

wwfn•1h ago
I would have thought supporting libcurl and libxml would also be in a company's self-interest. Is the fact that companies do this for the GPL'ed Linux kernel but not for BSD evidence that strong copyleft licensing limits the extent to which OSS projects are exploited or under-resourced?
dvt•2h ago
> Requiring technical evidence such as screencasts showing reproducibility, integration or unit tests demonstrating the fault, or complete reproduction steps with logs and source code makes it much harder to submit slop.

If this isn't already a requirement, I'm not sure I understand what even non-AI-generated reports look like. Isn't the bare minimum of CVE reporting a minimally reproducible example? Like, even if you find some function that, for example, doesn't do bounds-checking on some array, you can trivially write some unit-test code that breaks it.
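
What that kind of evidence could look like in practice - a minimal sketch using a hypothetical Python parser as a stand-in for the C-style bounds-checking case (the function, test, and input format are all illustrative, not taken from curl or the article): a unit test that a safe implementation would pass and the buggy one fails, together with the exact malformed input.

  import struct
  import unittest

  def read_record(buf: bytes) -> bytes:
      """Hypothetical vulnerable parser: trusts a 4-byte big-endian length
      prefix without checking it against the bytes actually present."""
      (length,) = struct.unpack_from(">I", buf, 0)
      # Bug: no bounds check. In Python the slice merely truncates silently;
      # the C analogue would read past the end of the buffer.
      return buf[4 : 4 + length]

  class TestReadRecord(unittest.TestCase):
      def test_rejects_length_beyond_buffer(self):
          # Declares a 16-byte payload but supplies only 3 bytes.
          malformed = struct.pack(">I", 16) + b"abc"
          # A safe parser should refuse this input; the buggy one accepts it,
          # so this test fails and reproduces the fault deterministically.
          with self.assertRaises(ValueError):
              read_record(malformed)

  if __name__ == "__main__":
      unittest.main()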

noirscape•1h ago
The problem is that a lot of CVEs don't represent "real" vulnerabilities, but merely theoretical ones that could hypothetically be combined to make a real exploit.

Regex exploitation is the forever example to bring up here, as it's generally the main reason that "autofail the CI system the moment an auditing command fails" doesn't work on certain codebases. This happens because it's trivial to craft a string that wastes significant resources when a regex is matched against it, so the moment you have a function that accepts a user-supplied regex pattern, that's suddenly an exploit... which gets a CVE. A lot of projects then have CVEs filed against them because internal functions take regex patterns as arguments, even if they're in code the user is flat-out never going to be able to interact with (i.e. several dozen layers deep in framework soup there's a regex call somewhere, in a way the user can't reach unless a developer several layers up starts breaking the framework they're using in really weird ways on purpose).
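
For concreteness, the resource-exhaustion case looks something like the sketch below (a textbook catastrophic-backtracking pattern, not taken from any particular CVE); the match time roughly doubles with each additional character of input.

  import re
  import time

  # Nested quantifiers like (a+)+ give a backtracking regex engine an
  # exponential number of ways to split the input once the final character
  # guarantees that the overall match must fail.
  PATTERN = re.compile(r"^(a+)+$")

  def match_time(payload: str) -> float:
      start = time.perf_counter()
      PATTERN.match(payload)
      return time.perf_counter() - start

  if __name__ == "__main__":
      for n in (16, 20, 24):
          payload = "a" * n + "!"  # trailing "!" forces every split to be tried
          print(f"n={n}: {match_time(payload):.3f}s")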

The CVE system is just completely broken and barely serves as an indicator of anything. From what I can tell, the approval process favors acceptance over rejection, since the people reviewing the initial CVE filing aren't the same people who actively investigate whether the CVE is bogus, and the incentive for the CVE system is literally to get companies to give a shit about software security (a fact that is also often exploited to create beg bounties). CVEs have been filed against software for what amounts to "a computer allows a user to do things on it" even before AI slop made everything worse; the system was questionable in quality seven years ago at the very least, and is even worse these days.

The only indicator it really gives is that a real security exploit can feel more legitimate if it gets a CVE assigned to it.

bawolff•1h ago
As someone who has worked on the receiving end of security reports: often not. They can be surprisingly poorly written.

You sort of want to reject them all, but occasionally a gem gets submitted, which makes you reluctant to.

For example, years ago I was responsible for triaging bug bounty reports at the SaaS company I worked at at the time. One of the most interesting reports came from someone who had found a way to bypass our OAuth flow using a bug in Safari that allowed them to get past most OAuth forms. The report was barely understandable, written in broken English. The impression I got was that they had tried to send it to Apple, but Apple ignored them. We ended up rewriting the report and submitting it to Apple on their behalf (we made sure the reporter got all the credit).

If we had ignored poorly written reports, we would have missed that. Is it worth it, though? I don't know.

hshdhdhehd•1h ago
In the AI age I'd prefer poorly written reports in broken English. Just as long as that doesn't become a known bypass, with the AI then instructed to sound broken.
Jean-Papoulos•1h ago
The solution isn't to block aggressively or to allow everything, but to prioritize. Put accounts older than the AI boom at the top, and allow them to give "referrals", i.e. stake part of their own credibility to boost another account up the priority ladder (sketched below).

Referral systems are very efficient at filtering noise.
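
A rough sketch of how such a priority ladder might be scored - the account fields, weights, and cutoff date below are made up for illustration and come from neither HackerOne nor the article:

  from dataclasses import dataclass
  from datetime import datetime
  from typing import Optional

  AI_BOOM = datetime(2022, 11, 30)  # arbitrary cutoff for "pre-AI-boom" accounts

  @dataclass
  class Account:
      name: str
      created: datetime
      credibility: float = 1.0  # grows with confirmed reports, shrinks with slop
      referred_by: Optional["Account"] = None

  def triage_priority(acct: Account) -> float:
      """Higher score = the report gets human attention sooner (illustrative weights)."""
      score = acct.credibility
      if acct.created < AI_BOOM:
          score += 2.0  # accounts older than the AI boom go to the top
      if acct.referred_by is not None:
          # The referrer stakes part of their own credibility on the newcomer.
          score += 0.5 * acct.referred_by.credibility
      return score

  def penalize_referral(acct: Account, amount: float = 0.5) -> None:
      """If a referred account submits slop, the referrer loses credibility too."""
      if acct.referred_by is not None:
          acct.referred_by.credibility = max(0.0, acct.referred_by.credibility - amount)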

skydhash•1h ago
It only works when it's a true stake - like losing privileges when a referral turns out to be a dunce. The downside is tribalism.
nixpulvis•52m ago
This is very short sighted.
goalieca•1h ago
> This is the fundamental problem: AI can generate the form of security research without the substance.

I think this is the fundamental problem of LLMs in general. Some of the time the output looks just right enough to seem legitimate. Luckily, the rest of the time it doesn't.

jsheard•1h ago
The other fundamental problem is that to a grifter, it's not a fundamental problem for the output to be plausible but often wrong. Plausible is all they need.
gdulli•13m ago
That's an important one. Another fundamental problem with plausible output is that it makes a manager, or a junior, or some other unsophisticated end user think the technology is almost there, and a reliably correct version is just around the corner.
beeburrt•1h ago
I didn't realize how bleak the future looks with respect to CVE infrastructure, both at MITRE and at the National Vulnerability Database.

What do other countries do for their stuff like this?

Maro•1h ago
Manufacturing vulnerability submissions that look like real vulnerability submissions, except the vulnerability isn't there and the submitter doesn't understand what the report is saying.

It's a cargo cult. Maybe the airplanes will land and bring the goodies!

https://en.wikipedia.org/wiki/Cargo_cult

supriyo-biswas•1h ago
Ironically, even this piece is significantly AI-generated:

- Primarily relies on a single piece of evidence from the curl project, and expands it into multiple paragraphs

- "But here's the gut punch:", "You're not building ... You're addressing ...", "This is the fundamental problem:" and so many other instances of Linkedin-esque writing.

- The listicle under "What Might Actually Work"

thadt•1h ago
Yesterday my wife burst into my office: "You used AI to generate that (podcast) episode summary, we don't sound like that!"

In point of fact, I had not.

After the security reporting issue, the next problem on the list is "trust in other people's writing".

bob1029•52m ago
I think one potential downside of using LLMs or exposing yourself to their generated content is that you may subconsciously adopt their quirks over time. Even if you aren't actively using AI for a particular task, prior exposure to their outputs could be biasing your thoughts.

This has additional layers to it as well. For example, I actively avoid using em dash or anything that resembles it right now. If I had no exposure to the drama around AI, I wouldn't even be thinking about this. I am constraining my writing simply to avoid the implication.

code51•41m ago
Exactly, and this is hell for programming.

You don't know whose style the LLM will pick for that particular prompt and project. You might end up with Carmack, or maybe with that buggy, test-failing piece-of-junk project on GitHub.

nixpulvis•52m ago
I'm so sick of people claiming things sound like AI when it so often isn't true.

Between this and the flip side - actual AI slop - it's getting really frustrating out here online.

chemotaxis•23m ago
I think people sometimes jump the gun over small things (em dashes, etc.). That said, in this instance, your anger is very likely misdirected: the article is almost certainly substantially AI-generated.
progbits•9m ago
From the article itself (presumably added later):

  Disclosure: Certain sections of this content were grammatically refined/updated using AI assistance
Cthulhu_•25m ago
Using the word "is" implies you have definite, conclusive proof, but all you actually have is a number of phrases that you believe are tells for AI-generated content. Are they really, or are you only now paying extra attention to them?

It's better to stay neutral and say you suspect it may be AI-generated.

And for everyone else, responsible disclosure of using AI tools to write stuff would be appreciated.

(This comment did not involve AI. I don't know how to type an em dash.)

cheesecompiler•14m ago
My least favourite part of this timeline: anyone who writes well gets classified as AI. Some of us press Option+Shift+- to insert an em dash and have been doing so for years.
progbits•8m ago
GP did not cite em dashes as evidence of AI, but rather the verbose style with a low signal-to-noise ratio and the typical phrases most LLMs like to use.
hacb•59m ago
I don't know why submitting a vulnerability on those platforms is still free. If reporters had to pay a small amount of money (say, $20-50, or an amount indexed to the maximum payout for a vulnerability in a given category) when submitting their report, maybe the reports would be of better quality.

I know that this poses new problems (some people can't afford to spend this money), but it would be better than just wasting people's time.

Chilinot•58m ago
This point is brought up and discussed in the linked article.
nixpulvis•57m ago
I believe "Economic Friction" is truly the best and potentially strongest way to deal with this.

It's good for the site collecting the fee, it's good for the projects being reported on, and it doesn't negatively affect valid reports.

It does exactly what we want by disincentivizing bad reports, whether AI-generated or not.

mmsc•39m ago
>First, the typical AI-powered reporter, especially one just pasting GPT output into a submission form, neither knows enough about the actual codebase being examined nor understands the security implications well enough to provide insight that projects need.

How ironic, considering that every time I've reported a complicated issue to a program on HackerOne, the triagers have completely rejected it because they do not understand the complicated codebase they are triaging for.

Also, the curl examples given in TFA completely ignore recent developments, where curl's maintainers welcomed and fixed literally hundreds of AI-found bugs: https://www.theregister.com/2025/10/02/curl_project_swamped_...

heymax054•38m ago
Semi-related: I posted this yesterday: https://news.ycombinator.com/item?id=45828917

Different models perform differently when it comes to catching/fixing security vulnerabilities.

samlinnfer•29m ago
Just add a country IP ban; we all know who is submitting these reports. Remember Hacktoberfest?
Cthulhu_•17m ago
That's a game of whack-a-mole; they'd just use a VPN. And besides, no, we don't "all know" who is submitting these reports - that's a generalization.
mschuster91•10m ago
> Just add a country IP ban, we all know who is submitting these reports.

As much as I'd like to see Russia, China and India disconnected from the wider Internet until they clean up shop with their abusive actors, the Hacktoberfest stuff you're likely referring to doesn't have anything to do with your implication - it was just the chance at a free t-shirt [1] that caused all the noise.

In ye olde times you'd need to take care how you behaved in public, because pulling off a stunt like that could reasonably lead to your company going out of business - but even a "small" company like DO is too big to fail from FAFO, much less ultra-large corporations like Google that just run on sheer moat. IMHO, that is where we have to start: break up the giants; maybe that's enough of a warning signal to get "smaller" large companies to behave like citizens again.

[1] https://domenic.me/hacktoberfest/

Cthulhu_•27m ago
Very blunt, maybe, but if individuals try to get internet points by filing frivolous security reports under their own names, should they be loudly pinned to a Wall of Shame to discourage the practice?
cheesecompiler•13m ago
> The problem isn't knowledge—it's incentives.

> When you're volunteering out of love in a market society, you're setting yourself up to be exploited.

I sound like a broken record, but there are unifying causes behind most of the issues I observe in the world.

None of the proposed solutions address the cause (and they can't, of course): public scrutiny doesn't do anything if account creation is zero-effort, and monetary penalization will kill submissions entirely.

progbits•4m ago

  Certain sections of this content were grammatically refined/updated using AI assistance, as English is not my first language.
OP: I sympathize, but I would much rather read your original text, with its typos and grammatical errors. By feeding it through the LLM you fix issues that are not really important, but you remove your own voice and end up with bland slop identical to 90% of these slop blogs (which yours isn't!).

Open Source Implementation of Apple's Private Compute Cloud

https://github.com/openpcc/openpcc
115•adam_gyroscope•23h ago•16 comments

I analyzed the lineups at the most popular nightclubs

https://dev.karltryggvason.com/how-i-analyzed-the-lineups-at-the-worlds-most-popular-nightclubs/
23•kalli•1h ago•10 comments

Show HN: See chords as flags – Visual harmony of top composers on musescore

https://rawl.rocks/
36•vitaly-pavlenko•19h ago•1 comments

Mathematical exploration and discovery at scale

https://terrytao.wordpress.com/2025/11/05/mathematical-exploration-and-discovery-at-scale/
128•nabla9•5h ago•36 comments

Ratatui – App Showcase

https://ratatui.rs/showcase/apps/
513•AbuAssar•12h ago•146 comments

The trust collapse: Infinite AI content is awful

https://arnon.dk/the-trust-collapse-infinite-ai-content-is-awful/
98•arnon•4h ago•71 comments

Cloudflare Tells U.S. Govt That Foreign Site Blocking Efforts Are Trade Barriers

https://torrentfreak.com/cloudflare-tells-u-s-govt-that-foreign-site-blocking-efforts-are-digital...
50•iamnothere•1h ago•26 comments

Solarpunk is happening in Africa

https://climatedrift.substack.com/p/why-solarpunk-is-already-happening
975•JoiDegn•18h ago•488 comments

How I am deeply integrating Emacs

https://joshblais.com/blog/how-i-am-deeply-integrating-emacs/
139•signa11•7h ago•80 comments

Staying opinionated as you grow

https://hugo.writizzy.com/being-opinionated/57a0fa35-1afc-4824-8d42-3bce26e94ade
21•hlassiege•1d ago•7 comments

IKEA launches new smart home range with 21 Matter-compatible products

https://www.ikea.com/global/en/newsroom/retail/the-new-smart-home-from-ikea-matter-compatible-251...
98•lemoine0461•1h ago•65 comments

Muzik magazine archives (1995-2003)

https://www.muzikmagazine.co.uk
11•petecooper•1w ago•1 comments

How often does Python allocate?

https://zackoverflow.dev/writing/how-often-does-python-allocate/
5•ingve•4d ago•1 comments

Pico-100BASE-TX: Bit-Banged 100 MBit/s Ethernet and UDP Framer for RP2040/RP2350

https://github.com/steve-m/Pico-100BASE-TX
22•_Microft•6d ago•1 comments

Dillo, a multi-platform graphical web browser

https://github.com/dillo-browser/dillo
379•nazgulsenpai•20h ago•151 comments

End of Japanese community

https://support.mozilla.org/en-US/forums/contributors/717446
748•phantomathkg•12h ago•558 comments

ChatGPT terms disallow its use in providing legal and medical advice to others

https://www.ctvnews.ca/sci-tech/article/openai-updates-policies-so-chatgpt-wont-provide-medical-o...
339•randycupertino•20h ago•354 comments

Firefox profiles: Private, focused spaces for all the ways you browse

https://blog.mozilla.org/en/firefox/profile-management/
311•darkwater•1w ago•159 comments

Eating Stinging Nettles

https://rachel.blog/2018/04/29/eating-stinging-nettles/
55•rzk•2h ago•66 comments

Why aren't smart people happier?

https://www.theseedsofscience.pub/p/why-arent-smart-people-happier
427•zdw•22h ago•503 comments

Recursive macros in C, demystified (once the ugly crying stops)

https://h4x0r.org/big-mac-ro-attack/
116•eatonphil•13h ago•55 comments

Show HN: Flutter_compositions: Vue-inspired reactive building blocks for Flutter

https://github.com/yoyo930021/flutter_compositions
32•yoyo930021•8h ago•10 comments

The Basic Laws of Human Stupidity (1987) [pdf]

https://gandalf.fee.urv.cat/professors/AntonioQuesada/Curs1920/Cipolla_laws.pdf
124•bookofjoe•15h ago•50 comments

Ruby and Its Neighbors: Smalltalk

https://noelrappin.com/blog/2025/11/ruby-and-its-neighbors-smalltalk/
212•jrochkind1•23h ago•122 comments

A new oral history interview with Ken Thompson

https://computerhistory.org/blog/a-computing-legend-speaks/
61•oldnetguy•5d ago•4 comments

New gel restores dental enamel and could revolutionise tooth repair

https://www.nottingham.ac.uk/news/new-gel-restores-dental-enamel-and-could-revolutionise-tooth-re...
572•CGMthrowaway•19h ago•207 comments

Carice TC2 – A non-digital electric car

https://www.caricecars.com/
260•RubenvanE•1d ago•186 comments

I want a good parallel language [video]

https://www.youtube.com/watch?v=0-eViUyPwso
94•raphlinus•2d ago•45 comments

The shadows lurking in the equations

https://gods.art/articles/equation_shadows.html
290•calebm•1d ago•85 comments