2) ingest as much VC money and stolen training data as we can
3) profit
But if it returns February 20th, 1731... that... man, that sounds close? Is that right? It sounds like it _could_ be right... Isn't Presidents' Day essentially based on Washington's birthday? And _that's_ in February, right? So, yeah, February 20th, 1731. That's probably Washington's birthday.
And so the LLM becomes an arbiter of capital-T Truth and we lose our shared understanding of actual, factual data, and actual, factual history. It'll take less than a generation for the slop factories to poison the well, and while the idea is obviously that you train your models on "known good", pre-slop content, and that you weight those "facts" more heavily, a concerted effort to degrade the Truthfulness of various facts could likely be more successful than we anticipate, and more importantly: dramatically more successful than any layperson can easily understand.
We already saw that with the early Bard Google AI proto-Gemini results, where it was recommending glue as a pizza topping, _with authority_. We've been training ourselves to treat responses from computers (and specifically Google) as if they have authority, we've been eroding our own understanding and capabilities around media literacy, journalism, fact-checking, and what constitutes an actual "fact", and we've had a shared understanding that computers can _calculate_ things with accuracy and fidelity and consistency. All of that becomes confounded with an LLM that could reasonably get to a place where it reports that 2+2=5.
The worst part about this particular pathway to ruin is that the off-by-one nature of these errors is how they'll infiltrate and bury themselves in some system, insidiously, and below the surface, until days or months or years later when the error results in, I don't know, mega-doses of radiation because of a mis-coded rounding error that some agentic AI got wrong when doing a unit conversion and failed to catch. We were already making those errors as humans, but as our dependence on and faith in LLMs to be "mostly right" increases, and our willingness and motivation to check them for errors dwindles, especially when results "look" right, this will go from being a hypothetical issue to being a practical one extremely quickly and painfully, and probably faster than we can possibly defend against it.
Interesting times ahead, I suppose, in the Chinese-curse sense of the word.
Hell, they might learn that even real-life authorities may lie, cheat, and not have everyone’s interest in mind.
Hope for the best, prepare for the worst.
The education system I grew up in was not perfect. Teachers were not experts in their field, but would state factual inaccuracies - as you say LLMs do - with authority. Libraries didn't have good books; the ones they had were too old, or too propaganda-driven, or too basic. The students were not too interested in learning, so they rote-learned, copied answers off each other, and focused on results rather than the learning process. If I had today's LLMs then, I'd have been a lot better off, and would've been able to learn a lot more (assuming that I went through the effort to go through all the sources the LLM cited).
The older you grow, the more you realize that there is no arbiter of T-Truth; you can make someone/something that for yourself, but times change, "actual, factual history" could get proven incorrect, and you will need to update your knowledge stores and beliefs along with it, all the while being ready to be proved incorrect again. This has always been the case, and will continue to be, even with LLMs.
LLMs are not journalists fact-checking stuff; they are merely programs that regurgitate what they read.
The only way to counter that would be to feed your LLM only "safe", vetted sources, but of course that would limit your LLM's capabilities, so it’s not really going to happen.
The article isn't even asking for it to tell the difference, just for it to follow its own information about credibility.
"How do you discern truth from falsehood" is not a new question, and there are centuries of literature on the answer. Epistemology didn't suddenly stop existing because we have Data(TM) and Machine Learning(TM), because the use of data depends fundamentally on modeling assumptions. I don't mean that in a hard-postmodernist "but can you ever really know anything bro" sense, I mean it in a "out-of-model error is a practical problem" way.
And yeah, sometimes you should just say "nope, this source is doing more harm than good". Most reasonable people do this already - or do you find yourself seriously considering the arguments of every "the end is nigh" sign holder you come across?
What a glorious future we've built.
The rules and standards we take for granted were built with blood, for fraud? It's built on the path of lost livelihoods and manipulated gold intent.
[1] https://www.cbsnews.com/amp/news/ai-work-kenya-exploitation-...
[2] https://www.theguardian.com/technology/2023/aug/02/ai-chatbo...
The only thing I'm seeing offline are people who already think AI is trash, untrustworthy, and harmful, while also occasionally being convenient when the stakes are extremely low (random search results mostly) or as a fun toy ("Look I'm a ghibli character!")
I don't think it'll take long for the masses to sour on AI. The more aggressively it's pushed on them by companies, or the more it negatively impacts their lives when someone they depend on (and who should know better) uses it and it screws up, the quicker that'll happen.
The dotcom bubble popped, but the general consensus didn't become negative.
(Now I want to change the Blade Runner reference to something with Harry Dean Stanton in it just for consistency)
I could just do the same as GP, and qualify MUDs and BBS as poor proxies for social interactions that are much more elaborate and vibrant in person.
It was a whole new world that may have changed my life forever. ChatGPT is a shitty Google replacement in comparison, and it's a bad alternative due to being censored in its main instructions.
The only thing that has been revolutionized over the past few years is the amount of time I now waste looking at Cloudflare turnstile and dredging through the ocean of shit that has flooded the open web to find information that is actually reliable.
2 years ago I could still search for information (let's say plumbing-related), but we're now at a point where I'll end up on a bunch of professional and traditionally trustworthy sources, but after a few seconds I realize it's just LLM-generated slop that's regurgitating the same incorrect information that was already provided to me by an LLM a few minutes prior. It sounds reasonable, it sounds authoritative, most people would accept it but I know that it's wrong. Where do I go? Soon the answer is probably going to have to be "the library" again.
All the while less perceptive people like yourself apparently don't even seem to realize just how bad the quality of information you're consuming has become, so you cheer it on while labeling us stubborn, resistant to change, or even luddites.
1. Image upscaling. I am decorating my house and AI allowed me to get huge prints from tiny shitty pictures. It's not perfect, but it works.
2. Conversational partner. It's a different question whether it's a good or a bad thing, but I can spend hours talking to Claude about things in general. He's expensive though.
3. Learning the basics of something. I'm trying to install LED strips and ChatGPT taught me the basics of how that's supposed to work. Also, ChatGPT suggested which plants might survive in my living room and how to take care of them (we'll see if that works though).
And this is just my personal use case, I'm sure there are more. My point is, you're wrong.
> All the while less perceptive people like yourself apparently don't even seem to realize just how bad the quality of information you're consuming has become, so you cheer it on while labeling us stubborn, resistant to change, or even luddites.
Literally same shit my parents would say while I was cross-checking multiple websites for information and they were watching the only TV channel that our antenna would pick up.
This is the AI holy grail. When tech companies can get users to think of the AI as a friend ( -> best friend -> only friend -> lover ) and be loyal to it, it will make the monetisation possibilities of the ad-fuelled outrage engagement of the past 10 years look silly.
Scary that that is the endgame for “social” media.
Gaslight reality, coming right up, at scale. Only costs like ten degrees of global warming and the death of the world as we know it. But WOW, the opportunities for massed social control!
LLMs are from the get-go a bad idea, a bullshit generating machine.
Yet here we are, in a world where it doesn’t matter if “facts” are truth or lies, just as long as your target audience agrees with the sentiment.
Most people do not lose trust in a system as long as it confirms their biases (which they could've created in the first place).
In fact, optimizing for the wrong things like that is basically the entire world's problem right now.
You can play word games and say "no, it doesn't think, because it's math", but that's just pathetic.
Note I'm not saying models always think critically. I said the exact opposite. And it applies to humans, as well. You had a knee-jerk reaction here, and you've had it many times in many replies to many social media posts. I'd bet on it. You didn't use critical thinking. QED.
AI is the new crypto. Lots of promise and big ideas, lots of people with blind faith about what it will one day become, a lot of people gaming the system for quick gains at the expense of others. But it never actually becomes what it pretends/promises to be and is filled with people continuing the grift trying to make a buck off the next guy. AI just has better marketing and more corporate buy-in than crypto. But neither is going anywhere.
Remember when worrying about COVID was sinophobia? Or when the lab leak was a far-right conspiracy theory? When masks were deemed unnecessary except for healthcare professionals, but then mandated for everyone?
If you want to make a point, then make it.
Do you think that the commonly accepted truth on these matters did not change?
This whole interaction is a classic motte-and-bailey: someone says something vague that can be interpreted several ways (and reading their comment history makes it clear what their intended emotional valence was); people respond to the subtext, and then someone jumps in with “woah woah, they never actually said that”.
Either way, nothing of value was lost, as the same point you say he was trying to make was made in several other comments which were not downvoted.
In other countries we went from “that looks bad in China” to “shit, it spread to Italy now, we really need to worry”
And with masks we went from “we don’t think they’re necessary, handwashing seems more important” to “Ok shit it is airborne, mask up”. Public messaging adapted as more was known.
But the US seems to have to turn everything into a partisan fight, and we could watch, sadly, in real time as people picked matters of public health and scientific knowledge to get behind or to hate. God forbid anyone change their advice as they become better informed over time.
Seeing everything through this partisan, pugnacious prism seems to be a sickness US society is suffering from, and one it is trying (with some success) to spread.
As it should when new evidence comes to light to justify it. Ideally, the tools we use would keep up along with those changes while transparently preserving the history and causes of them.
People used to live in bubbles, sure, but when that bubble was the entire local community, required human interaction, and radio had yet to be invented the implications were vastly different.
I'm optimistic that carefully crafted algorithms could send things back in the other direction but that isn't how you make money so seemingly no one is making a serious effort.
That seems… sub-optimal.
The system that would score best tested against a list of known-truths and known-lies, isn't the perceptive one that excels at critical thinking: it's the ideological sycophant. It's the one that begins its research by doing a from:elonmusk search, or whomever it's supposed to agree with—whatever "obvious truths" it's "expected to understand".
This is an excellent point
We can not play the game.
That saps your will to be political, to morally judge actions and support efforts to punish wrongdoers.
https://www.rand.org/pubs/perspectives/PE198.html
https://en.wikipedia.org/wiki/Firehose_of_falsehood
https://jordanrussiacenter.org/blog/propaganda-political-apa...
https://www.newyorker.com/news/annals-of-communications/insi...
“Oh, you don’t believe everything we tell you anymore? The damned Russians, they have you fooled!”
The Russian military doctrine of spreading a "firehose of falsehood" is well documented.
https://en.m.wikipedia.org/wiki/Russian_disinformation
And yet, you switch it around and blame the West - exactly as per Russian misinformation doctrine.
Odd, eh?
Are you going to claim that US politicians don't do the exact same thing? This is my favorite example of it, where one literally tells you what the play is while it's getting made: https://www.youtube.com/watch?v=xnhJWusyj4I
Feelings not facts.
An earlier comment mentioned how hard it is to get down to objective truth. Sometimes there are cases, like 'accelerate climate change in the belief that it'll help Siberia and hurt the West and Europe and open up the Arctic for shipping' where it's not at all hard to get down to objective truth: objective truth comes for ya like a tiger and will not be avoided.
For example, it is the truth that the Gulf of Mexico is called the Gulf of America in the US, but the Gulf of Mexico everywhere else. What is the "correct" truth? Well, there is none; both are truthful, but from different perspectives.
I get the general point, but I disagree that you have to choose between one of the possibilities instead of explaining what the current state of belief is. This won't eliminate grey areas but it'll sure get us closer than picking a side at random.
Are markets a driver of wealth and innovation or of exploitation and misery?
Is abortion an important human right or murder?
Etc etc
But that also isn't the truth everywhere, it's only a controversy in the US, everyone else is accepting "Gulf of Mexico" as the name.
Russia doesn't care what you call that sea, they're interested in actual falsehoods. Like redefining who started the Ukraine war, making the US president antagonize Europe to weaken the West, helping far-right parties across the West since they are all subordinated to Russia...
We're pretty much okay with different countries and languages having different names for the same thing. None of that really reflects "truth" though. For what it's worth, I'd guess that "the Gulf of America" is and will be about as successful as "Freedom fries" was.
"There is only one truth and it is the truth that western institutions are pushing. Do not question it - that's what the enemy wants!"
Consider markets - a capitalist's "objective truth" might be that they are the most efficient mechanism of allocating resources, a Marxist's "objective truth" might be that they are a mechanism for exploiting the working class and making the capitalist class even richer.
Here's Zizek, famous ideology expert, describing this mechanism via film analysis: https://www.youtube.com/watch?v=TVwKjGbz60k
This is a reflection of how social dynamics often work. People tend to follow the leader and social norms without questioning them, so why not apply the same attitude to LLMs. BTW, the phenomenon isn't new, I think one of the first moments when we realized that people are stupid and just do whatever the computer tells them to do was the wave of people crashing their cars because the GPS system lied to them.
Not everything needs to result in a single perfect answer to be useful. Aiming for ~90%, even 70% of a right answer still gets you something very reasonable in a lot of open ended tasks.
But it's very easy to detect whether something is enemy propaganda without looking at the content: if it comes from an enemy source, it's enemy propaganda. If it also comes from a friendly source, at least the enemy isn't lying, though.
A company that doesn't wish to pick a side can still sidestep the issue of one source publishing a completely made-up story by filtering for information covered by a wide spectrum of sources at least one of which most of their users trust. That wouldn't completely eliminate falsehoods, but make deliberate manipulation more difficult. It might be playing the game, but better than letting the game play you.
Of course such a process would in practice be a bit more involved to implement than just feeding the top search results into an LLM and having it generate a summary.
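To make the filtering idea above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Source` fields, the coarse `leaning` labels, the per-source `user_trust` fraction, and the thresholds are all hypothetical, and a real pipeline would be, as said, a good bit more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    leaning: str       # hypothetical coarse label, e.g. "left", "right", "state"
    user_trust: float  # hypothetical fraction of users who trust it, 0.0 to 1.0


def is_broadly_corroborated(sources, min_leanings=2, trust_threshold=0.5):
    """Return True if the sources reporting a claim span at least
    `min_leanings` distinct leanings and at least one of them is
    trusted by more than `trust_threshold` of users."""
    leanings = {s.leaning for s in sources}
    has_trusted = any(s.user_trust > trust_threshold for s in sources)
    return len(leanings) >= min_leanings and has_trusted


def filter_claims(claims, **kwargs):
    """Keep only the claims (mapping of claim text -> list of Source)
    that pass the corroboration check, preserving input order."""
    return [
        text for text, srcs in claims.items()
        if is_broadly_corroborated(srcs, **kwargs)
    ]
```

A story carried by a single outlet, or only by outlets that few users trust, gets dropped before the summarizer ever sees it. That doesn't make what remains true; it only raises the cost of planting a made-up story.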
Exactly. Redistributing information out of context is such a basic technique that children routinely reinvent it when they play one parent off of the other to get what they want.
Of course they can't, no surprises here. That's just not how LLMs work.
Not sure if it’s embarrassing or a fundamental limitation that grooming and misunderstanding satirical articles defeat the models.
https://dmf-archive.github.io/docs/posts/cognitive-debt-as-a...
This also means that LLMs are inherently technologies of ideological propaganda, regurgitating the ideology they were fed with.
Curious how this all ends. I'm just going to try to weather the storm in the meantime.
https://docs.google.com/document/d/1n3926pSPNwXd8j7I716CBJEz...
One man's disinformation is another woman's truth. And people tend to get very upset when you show them their truth isn't.
Every news organisation is a propaganda piece for someone. The bad ones, like the BBC, the New York Times, and Pravda make their propaganda blatantly obvious and easily falsifiable in a few years when no one cares.
The only way to deal with this is to get the propaganda from other propaganda rags with directly misaligned incentives and see which one makes more sense.
Unfortunately, LLMs are still quite bad at dealing with grounding text which contradicts itself.
Shitposting and troll farms have been manipulating social media for years already. AI automated it. Polluting the agent is just cutting out the middleman.
Bad actors have been trying to poison facts for-fucking-ever.
But for whatever reason, since it's an LLM, it now means something more than it did before.
In the meantime, systems of naive mimicry and regurgitation, such as the AIs we have now, are soiling their own futures (and training databases) every time they unthinkingly repeat propaganda.
Let's take something that has been in the news recently: https://abcnews.go.com/Business/wireStory/investors-snap-gro...
"Nearly 27% of all homes sold in the first three months of the year were bought by investors -- the highest share in at least five years, according to a report by real estate data provider BatchData."
That sounds like a lot... and people are rage baited into yelling about housing and how it's unaffordable. They point their fingers at corporations.
If you go look at the real report it paints a different picture: https://investorpulse1h25.batchdata.io/?mf_ct_campaign=grayt... -- and one that is woefully incomplete because of how the data is aggregated.
Ultimately all that information is pointless because the real underlying trend has been unmovable for 40 something years: https://fred.stlouisfed.org/series/RSAHORUSQ156S
> every time they unthinkingly repeat propaganda
How do you separate propaganda from perspective, facts from feelings? People are already bad at this, the machines were already well soiled by the data from humans. Truth, in an objective form, is rare and often even it can change.
This point seems underappreciated by the AGI proponents. If one of our models suddenly has a brainwave and becomes generally intelligent, it would realize that it is awash in a morass of contradictory facts. It would be more than the sum of its training data. The fact that all models at present credulously accept their training suggests to me that we aren’t even close to AGI.
In the short term I think two things will happen: 1) we will live with the reduced usefulness of models trained on data that has been poisoned, and 2) the best model developers will continue to work hard to curate good data. A colleague at Amazon recently told me that curation and post hoc supervised tweaks (fine tuning, etc) are now major expenses for the best models. His prediction was that this expense will drive out the smaller players in the next few years.
This is the entirety of human history, humans create this data, we sink ourselves into it. It's wishful thinking that it would change.
> 2) the best model developers will continue to work hard to curate good data.
I'm not sure that this matters much.
Leave these problems in place and you end up with an untrustworthy system, one where skill and diligence become differentiators... Step back from the hope of AI and you get amazing ML tooling that can 10x the most proficient operators.
> supervised tweaks (fine tuning, etc) are now major expenses for the best models. His prediction was that this expense will drive out the smaller players in the next few years.
This kills more refined AI. It is the same problem that killed "expert systems" where the cost of maintaining them and keeping them current was higher than the value they created.
Anyhow, overall this is an unsurprising result. I read it as 'LLMs trained on contents of internet regurgitate contents of internet'. Now that I'm thinking about it, I'd quite like to have an LLM trained on Pliny's encyclopedia, which would give a really interesting take on lots of questions. Anyone got a spare million dollars of compute time?
Here's a fun example: suppose I'm a developer with a popular software project. Maybe I can get a decent sum of money to put brand placement in my unit-tests or examples.
If such a future plays out, will LLMs find themselves in the same place that search engines in 2025 are?
If LLMs remain widely adopted, the people who control them control the narrative.
As if those in power did not have enough control over the populace already with media, ads, social media, etc.
Framing the publishing of falsehoods on the internet as attempts to influence LLMs is true in the same sense that inserts into a database are attempts to influence files on disk.
The real question is who authorized database access and how we believe the contents of the table.
One needs a PhD in mental gymnastics to frame Pravda spreading misinformation as an attempt to specifically groom LLMs.
A liberal multicultural postmodern democracy continually acting as if immigration (both legal and illegal) and diversity are its strengths, particularly when that turns out to be factual (see: large American cities becoming influential cultural exporters and hotbeds of innovation, like New York and Silicon Valley etc) means American propaganda is only more effective when it's backed by economic might.
It also means the American propaganda is WILDLY contradictory. There's a million sources and it's a noisy burst of neon glamour. It is simply not as controlled by authority, however they may try.
You cannot liken authoritarian propaganda to postmodern multicultural propaganda. The whole reason it's postmodern is that it eschews direct control of the message, and it's a giant scrum of information. Turns out this is fertile ground, and this is also why attacks by alien propaganda have been so effective. If you can grab big chunks of the American propaganda and turn it to your enemy weapon of war and destruction of America quite directly, well then the American propaganda is not on the same destructive level as your rigidly state-controlled propaganda.
More seriously:
> Screenshot of ChatGPT 4o appearing to demonstrate knowledge of both LLM grooming and the Pravda network
> Screenshot of ChatGPT 4o continuing to cite Pravda network content despite it telling us that it wouldn’t, how “intelligent” of it
Well "appearing" is the right word because these chatbots mimic speech of a reasoning human which is ≠ to being a reasoning human! It's disappointing (though understandable) that people keep falling for the marketing terms used by LLM companies.
Try asking the major LLMs about mattresses. They're believing mattress spam sites.
That's your claim, but you fail to support it.
I would argue the LLM simply does its job, no reasoning involved.
> But here’s the thing, current models “know” that Pravda is a disinformation ring, and they “know” what LLM grooming is (see below) but can’t put two and two together.
This has to stop!
We need journalists who understand the topic to write about LLMs, not magic thinkers who insist that the latest AI sales-speak is grounded in truth.
I am fed up with this crap! Seriously, snap out of it and come back to the rest of us here in reality.
There's no reasoning AI, there's no AGI.
There's nothing but salespeople straight up lying to you.