
AI Slop vs. OSS Security

https://devansh.bearblog.dev/ai-slop/
83•mooreds•2h ago

Comments

pksebben•1h ago
> The model has no concept of truth—only of plausibility.

This is such an important problem to solve, and it feels soluble. Perhaps a layer with heavily biased weights, trained on carefully curated definitional data. If we could train in a sense of truth - even a small one - many of the hallucinatory patterns would disappear.

Hats off to the curl maintainers. You are the xkcd Jenga block at the base.

jcattle•1h ago
I am assuming that millions of dollars have already been spent trying to get LLMs to hallucinate less.

Even if problems feel soluble, they often aren't. You might have to invent an entirely new paradigm of text generation to solve the hallucination problem. Or it could be the Collatz Conjecture of LLMs: it "feels" so possible, but you never really get there.

big-and-small•1h ago
Nuclear fusion was always 30 years away (c)
quikoa•24m ago
It would be nice if nuclear fusion had the AI budget.
pjc50•1h ago
The "fact database" is the old AI solution, e.g. Cycorp; it doesn't quite work either. Knowing what is true is a really hard, unsolved problem in philosophy, see e.g. https://en.wikipedia.org/wiki/Gettier_problem . The secret to modern AI is just to skip that and replace unsolvable epistemology with "LGTM", then sell it to investors.
wongarsu•23m ago
Truth comes from being able to test your assertions. Without that, they remain in the realm of plausibility. You can't get from plausibility to truth with better training data; you need to give LLMs better tools to test the truth of their plausible statements before spewing them to the user (and to train the models to use them, obviously - but that's not the hard part).
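A toy illustration of that test-before-asserting idea (everything here is invented for the example; the claim string stands in for model output, and the check is plain Python):

```python
# Hedged sketch: re-check a plausible generated claim by computation
# before surfacing it, instead of trusting the generated text.

claim = "17 * 23 = 381"   # plausible-looking model output, but wrong

lhs, rhs = claim.split("=")
a, _, b = lhs.split()     # parse the toy "a * b" claim
actual = int(a) * int(b)

if actual == int(rhs):
    print("verified:", claim)
else:
    print("rejected:", claim, f"(actual: {actual})")
```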
wwfn•1h ago
Wealth generated on top of underpaid labor is a recurring theme -- and in this case maybe surprisingly exacerbated by LLMs.

Would this be different if the underlying code had a viral license? If Google's infrastructure were built on a GPL'ed libcurl [0], would they have an investment in the code / a team with resources to evaluate security reports (slop or otherwise)? Ditto for libxml.

Does the GPL help the Linux kernel get investment from its corporate users?

[0] Perhaps an impossible hypothetical. Would Google have skipped over the imaginary GPL'ed libcurl or libxml for a more permissively licensed library? And even if they didn't, would a big company's involvement in an openly developed ecosystem create asymmetric funding/goals, a la XMPP or Nix?

big-and-small•1h ago
Copyleft licenses are made to support freedom for everyone, particularly end-users. They only limit the freedom of developers/maintainers to exploit the code and its users.

> Does the GPL help the Linux kernel get investment from its corporate users?

The GPL has helped "Linux kernel the project" greatly, but companies invest in it out of self-interest. They want to benefit from upstream improvements, and playing nicely by upstreaming changes is just much cheaper than maintaining their own kernel fork.

On the other side you have companies like Sony, which have used BSD OS code in their game consoles for decades and contributed shit.

So... Two unrelated things.

wwfn•36m ago
I would have thought supporting libcurl and libxml would also be in a company's self-interest. Is the fact that companies do this for the GPL'ed Linux kernel but not for BSD evidence that strong copyleft licensing limits the extent to which OSS projects are exploited/under-resourced?
dvt•1h ago
> Requiring technical evidence such as screencasts showing reproducibility, integration or unit tests demonstrating the fault, or complete reproduction steps with logs and source code makes it much harder to submit slop.

If this isn't already a requirement, I'm not sure I understand what even non-AI-generated reports look like. Isn't the bare minimum of CVE reporting a minimally reproducible example? Like, even if you find some function that, for example, doesn't do bounds-checking on some array, you can trivially write some unit-testing code that's able to break it.
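A minimal sketch of that kind of reproduction-as-a-test (parse_len is a made-up stand-in, not from curl or any real codebase; the test fails against the buggy code, which is exactly the demonstration of the fault):

```python
import pytest

def parse_len(buf: bytes) -> bytes:
    n = buf[0]            # attacker-controlled length byte
    return buf[1:1 + n]   # BUG: never checks that n bytes actually exist

def test_rejects_lying_length_byte():
    # Fails today (no ValueError is raised), demonstrating the missing check.
    with pytest.raises(ValueError):
        parse_len(bytes([255, 1, 2]))   # claims 255 payload bytes, has 2
```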

noirscape•1h ago
The problem is that a lot of CVEs don't represent "real" vulnerabilities, but merely theoretical ones that could hypothetically be combined to make a real exploit.

Regex exploitation is the forever example to bring up here, as it's generally the main reason that "autofail the CI system the moment an auditing command fails" doesn't work on certain codebases. This happens because it's trivial to craft a string that wastes significant resources when regex-matched against it, so the moment you have a function that accepts a user-supplied regex pattern, that's suddenly an exploit... which gets a CVE. A lot of projects then have CVEs filed against them because internal functions take regex calls as arguments, even if they're in code the user is flat-out never going to be able to interact with (i.e., several dozen layers deep in framework soup there's a regex call somewhere, in a way the user won't be able to access unless a developer several layers up starts deliberately breaking the framework they're using in really weird ways).
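The textbook catastrophic-backtracking demo of this (the pattern and input are the classic teaching example, not from any specific CVE):

```python
import re
import time

# A nested quantifier plus an input that almost matches forces Python's
# backtracking regex engine to try exponentially many splits of the 'a's.
pattern = re.compile(r"^(a+)+$")
evil = "a" * 26 + "!"

t0 = time.perf_counter()
pattern.match(evil)   # returns None, but only after a multi-second stall
print(f"{time.perf_counter() - t0:.1f}s for a {len(evil)}-character input")
```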

The CVE system is just completely broken and barely serves as an indicator of anything, really. From what I can tell, the approval process favors acceptance over rejection, since the people reviewing the initial CVE filing aren't the same people who actively investigate whether the CVE is bogus, and the incentive for the CVE system is literally to encourage companies to give a shit about software security (at the same time, this fact is also often exploited to create beg bounties). CVEs have been filed against software for what amounts to "a computer allows a user to do things on it" even before AI slop made everything worse; the system was questionable in quality 7 years ago at the very least, and is even worse these days.

The only indicator it really gives is that a real security exploit can feel more legitimate if it gets a CVE assigned to it.

bawolff•51m ago
As someone who has worked on the receiving end of security reports: often not. They can be surprisingly poorly written.

You sort of want to reject them all, but occasionally a gem gets submitted, which makes you reluctant.

For example, years ago I was responsible for triaging bug bounty reports at a SaaS company I worked at at the time. One of the most interesting reports was from someone who had found a way past our OAuth thing using a bug in Safari that allowed them to bypass most OAuth forms. The report was barely understandable, written in broken English. The impression I got was that they had tried to send it to Apple, but Apple ignored them. We ended up rewriting the report and submitting it to Apple on their behalf (we made sure the reporter got all the credit).

If we had ignored poorly written reports, we would have missed that. Is it worth it, though? I don't know.

hshdhdhehd•47m ago
In the AI age I'd prefer poorly written reports in broken English. Just as long as that doesn't become a known bypass, with the AI instructed to sound broken.
Jean-Papoulos•1h ago
The solution isn't to block aggressively or to allow everything, but to prioritize. Put accounts older than the AI boom at the top, and allow them to give "referrals", i.e. stake a part of their own credibility to boost another account on the priority ladder.

Referral systems are very efficient at filtering noise.
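A toy sketch of what that staking could look like (all names, numbers, and the slashing rule are invented for illustration):

```python
# Hypothetical referral ladder: vouching lends priority, and a bogus
# report burns both the reporter and whoever staked credibility on them.

credibility = {"veteran": 10.0, "newcomer": 0.0}
referrals = {"newcomer": "veteran"}   # veteran vouches for newcomer
STAKE = 2.0

def priority(account: str) -> float:
    # Referred accounts borrow their backer's staked credibility.
    return credibility[account] + (STAKE if account in referrals else 0.0)

def resolve_report(account: str, valid: bool) -> None:
    credibility[account] += 1.0 if valid else -3.0   # slop costs more
    if not valid and account in referrals:
        credibility[referrals[account]] -= STAKE     # the stake was real

resolve_report("newcomer", valid=False)
print(priority("newcomer"), credibility)   # -1.0 {'veteran': 8.0, 'newcomer': -3.0}
```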

skydhash•46m ago
It only works when it's a true stake, like losing privileges when a referral is a dunce. The downside is tribalism.
nixpulvis•15m ago
This is very short-sighted.
goalieca•1h ago
> This is the fundamental problem: AI can generate the form of security research without the substance.

I think this is the fundamental problem of LLMs in general. Some of the time the output looks just right enough to seem legitimate. Luckily, the rest of the time it doesn't.

jsheard•57m ago
The other fundamental problem is that, to a grifter, it's not a fundamental problem for the output to be plausible but often wrong. Plausible is all they need.
beeburrt•59m ago
I didn't realize how bleak the future looks w.r.t. CVE infrastructure, both at MITRE and at the National Vulnerability Database.

What do other countries do for their stuff like this?

Maro•48m ago
Manufacturing vulnerability submissions that look like real vulnerability submissions, but where the vulnerability isn't there and the submitter doesn't understand what they're saying.

It's a cargo cult. Maybe the airplanes will land and bring the goodies!

https://en.wikipedia.org/wiki/Cargo_cult

supriyo-biswas•33m ago
Ironically, even this piece is significantly AI-generated:

- Primarily relies on a single piece of evidence from the curl project, and expands it into multiple paragraphs

- "But here's the gut punch:", "You're not building ... You're addressing ...", "This is the fundamental problem:" and so many other instances of Linkedin-esque writing.

- The listicle under "What Might Actually Work"

thadt•25m ago
Yesterday my wife burst into my office: "You used AI to generate that (podcast) episode summary, we don't sound like that!"

In point of fact, I had not.

After the security reporting issue, the next problem on the list is "trust in other people's writing".

bob1029•16m ago
I think one potential downside of using LLMs or exposing yourself to their generated content is that you may subconsciously adopt their quirks over time. Even if you aren't actively using AI for a particular task, prior exposure to their outputs could be biasing your thoughts.

This has additional layers to it as well. For example, I actively avoid using em dashes or anything that resembles them right now. If I had no exposure to the drama around AI, I wouldn't even be thinking about this. I am constraining my writing simply to avoid the implication.

code51•5m ago
Exactly, and this is hell for programming.

You don't know whose style the LLM will pick for that particular prompt and project. You might end up with Carmack, or maybe with that buggy, test-failing piece-of-junk project on GitHub.

nixpulvis•16m ago
I'm so sick of people claiming things sound like AI when it's simply not true.

Between this and the flip side, actual AI slop, it's getting really frustrating out here online.

hacb•22m ago
I don't know why submitting a vulnerability on those platforms is still free. If reporters had to pay a small amount of money (say, $20-50, or an amount indexed to the maximum payout for a vulnerability in a given category) when submitting their report, maybe those reports would be of better quality.

I know that this poses new problems (some people can't afford to spend this money), but it would be better than just wasting people's time.

Chilinot•21m ago
This point is brought up and discussed in the linked article.
nixpulvis•21m ago
I believe "Economic Friction" is truly the best and potentially strongest way to deal with this.

It's good for the site collecting the fee, it's good for the projects being reported on and it doesn't negatively affect valid reports.

It does exactly what we want by disincentivizing bad reports, either AI generated or not.
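A back-of-envelope version of why a refundable deposit wouldn't hurt careful reporters (all numbers invented; assume the deposit is returned iff the report is valid):

```python
DEPOSIT = 20.0    # hypothetical submission deposit, USD
BOUNTY = 500.0    # hypothetical payout for a valid report

def expected_value(p_valid: float) -> float:
    # Valid report: bounty plus refunded deposit. Invalid: deposit forfeited.
    return p_valid * BOUNTY - (1 - p_valid) * DEPOSIT

print(expected_value(0.90))   # careful researcher: +448.0 per report
print(expected_value(0.01))   # high-volume slop:    -14.8 per report
```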

"Operation Chargeback": Large-scale fraud with credit card data uncovered

https://www.heise.de/en/news/Operation-Chargeback-Large-scale-fraud-with-credit-card-data-uncover...
1•doener•34s ago•0 comments

Sarepta: Enough, for God's Sake [Science]

https://www.science.org/content/blog-post/sarepta-enough-god-s-sake
1•randycupertino•44s ago•0 comments

Coding on Paper

https://thepalindrome.org/p/coding-on-paper
1•sebg•1m ago•0 comments

Discrete Substrate Model: CMB and Galaxy Rotation Without Dark Matter

https://zenodo.org/records/17538351
1•FlipSpace•1m ago•0 comments

Why Vaadin Is Perfect for AI-Driven Development

https://martinelli.ch/why-vaadin-is-perfect-for-ai-driven-development/
1•spicyroman•2m ago•1 comments

Counting to Five for the Government in the Tariffs Case

https://reason.com/volokh/2025/11/06/counting-to-five-for-the-government-in-the-tariffs-case/
1•pcaharrier•3m ago•0 comments

End of The Line: how Saudi Arabia's Neom dream unravelled

https://subs.ft.com/products
1•ianrahman•3m ago•0 comments

Towards Humanist Superintelligence

https://microsoft.ai/news/towards-humanist-superintelligence/
1•meetpateltech•3m ago•0 comments

AI Power Users Are Impressing Bosses and Leaving Co-Workers in the Dust

https://www.wsj.com/tech/ai/these-ai-power-users-are-impressing-bosses-and-leaving-co-workers-in-...
1•historian1066•4m ago•0 comments

Meta tumbles as investors compare its AI spending splurge to metaverse missteps

https://www.latimes.com/business/story/2025-11-05/meta-slumps-on-ai-spending-echoing-2022-metaver...
3•1vuio0pswjnm7•8m ago•0 comments

Ask HN: How do you develop large complex software

2•Rick76•8m ago•0 comments

Announcing Support for Complex Attribute Types in OTel

https://opentelemetry.io/blog/2025/complex-attribute-types/
1•shyang•8m ago•0 comments

Google vs. FFmpeg – a showdown over AI-found bugs and who must fix them

https://piunikaweb.com/2025/11/06/google-vs-ffmpeg-open-source-big-sleep-ai-bugs-and-who-must-fix...
1•donohoe•9m ago•0 comments

I Take Math Tests with Double Vision

https://veroniiiica.com/take-math-tests-with-double-vision/
1•speckx•9m ago•0 comments

Chimpanzees rationally revise their beliefs

https://www.science.org/doi/10.1126/science.adq5229
2•belter•10m ago•0 comments

Games for rehab: Fast communication for interactive VR and AR

https://news.umich.edu/games-for-rehab-fast-communication-for-interactive-vr-and-ar/
1•JeanKage•10m ago•0 comments

AI may fatally wound web's ad model, warns Tim Berners-Lee

https://www.ft.com/content/20592619-1bb9-451e-b4af-fcd33d981076
3•1vuio0pswjnm7•11m ago•0 comments

OpenAPI won't make your APIs AI-ready. But Arazzo can

https://bump.sh/blog/make-your-apis-ai-ready/
3•scharrier•14m ago•0 comments

The Filesystem for Agents

https://github.com/penberg/agentfs
1•figomore•14m ago•0 comments

ADHD and Reading: Conquer Struggles, Boost Comprehension, Enjoy Books

https://adhdreading.org/en
2•wangneo276•14m ago•0 comments

Health and Criminal Consequences of Involuntary Hospitalization [pdf]

https://www.newyorkfed.org/medialibrary/media/research/staff_reports/sr1158.pdf
1•PaulHoule•15m ago•0 comments

I made a site that certifies every project as Fast (fast)

https://www.blazingly.fast/
1•arrno•16m ago•1 comments

Diffusion, Language, and Emergence of Musical Creativity: An Experiment with Suno

https://medium.com/@ersinesen/diffusion-language-and-the-emergence-of-musical-creativity-an-exper...
1•ersinesen•16m ago•1 comments

Mastodon 4.5

https://blog.joinmastodon.org/2025/11/mastodon-4.5/
3•pentagrama•16m ago•0 comments

Google plans to put datacentres in space to meet demand for AI

https://www.theguardian.com/technology/2025/nov/04/google-plans-to-put-datacentres-in-space-to-me...
1•JeanKage•16m ago•0 comments

The Faery Tale Adventure – Amiga

https://github.com/viridia/faery-tale-amiga
2•lwn•16m ago•0 comments

Weighted Quantile Weirdness and Bugs

https://www.practicalsignificance.com/posts/weighted-quantile-weirdness/
1•yurivish•17m ago•0 comments

Cutting Tariffs and Red Tape for Rooftop Solar

https://marginalrevolution.com/marginalrevolution/2025/11/here-comes-the-sun-if-we-let-it-cutting...
1•mhb•17m ago•0 comments

Charmbracelet/mods: AI on the command line

https://github.com/charmbracelet/mods
1•surprisetalk•17m ago•0 comments

Hiring the Joker

https://quarter--mile.com/hiring-the-joker
1•surprisetalk•18m ago•0 comments