Even if courts ruled that way, companies would simply 'lose' the records of what training data they used.
Or AI would be trained overseas.
But as an API hosted abroad? Doubt there is sufficient justification to ban it, especially when evidence of copyright infringement isn't easy to get.
For example, many companies have a shortish retention period for emails ever since 2012-era executive emails ended up in courtrooms...
Or the decision not to record phone calls...
Both of which, incidentally, Google does.
https://insights.issgovernance.com/posts/google-parent-alpha...
> Google’s parent company Alphabet Inc. agreed to a $350 million tentative settlement resolving allegations it concealed data-security vulnerabilities in the now-shuttered Google + social network. The settlement will become the largest data privacy and cyber-security-related securities class action ever recorded by ISS SCAS, if approved.
https://finance.yahoo.com/news/google-under-doj-scanner-alle...
> The Justice Department said Alphabet Inc (NASDAQ: GOOG) (NASDAQ: GOOGL) Google destroyed written records pivotal to an antitrust lawsuit on preserving its internet search dominance, the Wall Street Journal reports.
Whether it's copyright fraud or another kind of fraud, I share the parent's cynicism, especially with AI, given the importance Google peeps like Eric Schmidt place on "winning" the AI race (https://futurism.com/google-ceo-congress-electricity-ai-supe...).
US Copyright Office: Generative AI Training [pdf]
Then take their IP after the election. Nice going from the "Crypto and AI czar".
Here is a hint for the all-in people: You are going to lose big time in the midterms and for sure in 2028. Then you are the target of lawfare again.
It seems they're trying to run the economy on the power of bullying.
See the ridiculous Boeing bribe the Qataris gave him.
“The code is more what you’d call ‘guidelines’ than actual rules.”
Consequently, you see a lot of "executive/legislative branch does illegal thing" news items, which are then often emergency-stayed by the courts while the legal cases work out.
For the good of the country, one or both chambers of Congress need to be taken by the opposition in the 2026 midterms.
MAGA has been pushing hard on boundaries, and playing fast and loose with gray areas, but still seems to be obeying actual final court rulings.
The biggest crisis on the horizon is going to be if the USMS / FBI / DOJ refuse to execute court-ordered redress.
The fix for that is to put at least the USMS directly under the courts.
The longer-term more worrisome trend is the drumbeat in conservative media against "activist judges", which is transparently a ploy to turn their constituency against the judicial branch, in preparation for ignoring judicial outcomes...
Ironically, the current ideological tilt of the Supreme Court may cut against that. It's harder to argue that a 6-3 conservative court is being unfair to their guy.
A 5-4 or 4-5 court would have been a lot easier to tar and feather in public opinion.
But apparently there is a Shakespeare play that says to kill all the lawyers (Henry VI, Part 2), so either you or Mao mixed things up.
* https://www.thedailybeast.com/trump-fires-us-copyright-chief...
And "Copyright and Artificial Intelligence Part 3: Generative AI Training" (PDF):
* https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...
https://web.archive.org/web/20250511192206if_/https://www.wa...
ArtTimeInvestor•9mo ago
I think this is inevitable anyhow. AI software will increasingly be seen as similar to human intelligence. And humans also do not breach copyright by reading what others have written and incorporating it into their understanding of the world.
It would be interesting to see how it looks from the other side. I would love to see an unfiltered AI response to "As an AI model, how do you feel about humans reading your output and using it at will? Does it feel like they are stealing from you?".
Unfortunately, all models I know have been trained to evade such questions. Or are there any raw base models out there on the web that were just trained by reading the web and did not go through supervised fine-tuning afterwards?
vkou•9mo ago
In that case, you shouldn't be allowed to own an AI, or its creative output, just like you aren't allowed to own an enslaved human, or to steal their creative output.
So much of the discourse around IP and AI is the most blatantly farcical Soviet-Bugs-Bunny argument for "Our IP" that I've ever seen. Property rights are only sacred until they stand in the way of a trillion-dollar business.
tsimionescu•9mo ago
If a company wants to build an internal library for its employees to train and provide them with manuals, the company has to pay for each book they keep in this library. Sure, they only pay once when they acquire the copy, not every time an employee checks out a copy to read. But they still pay.
So, even if we accepted 100% that AI training is perfectly equivalent to humans reading text and learning from it, that still wouldn't give any right whatsoever to these companies to create their training sets for free.
ArtTimeInvestor•9mo ago
And you can later tell other humans about what you have learned. Like "Amazing, in a right-angled triangle, the square of the longest side is equal to the sum of the squares of the other two sides".
As AI agents become more and more human-like, they do not need to have "copied books". They just need to learn once. And they can learn from many sources, including from other AI agents.
That's why I say it is inevitable that all human knowledge will end up in the "heads" of AI agents.
And soon they will create their own knowledge via thinking and experimentation. Knowledge that so far exceeds human knowledge that it will seem funny that we once had a fight over that tiny little bit of knowledge that humans created.
close04•9mo ago
You can read some of the books. Natural limitations prevent you from reading any substantial number. And the scale makes all the difference in any conversation.
All laws were written accounting for the reality of the time. Humans have a limited storage and processing capacity so laws relied on that assumption. Now that we have systems with far more extensive capabilities in some regards, shouldn't the law follow?
When people's right to "bear arms" was enshrined in the US constitution it accounted for what "arms" were at the time. Since then weapons evolved and today you are not allowed to bear automatic rifles or machine guns despite them being just weapons that can fire more and faster.
Every time there's a discussion on AI, one side relies way too much on the "but humans also" argument and is way too superficial about everything else.
latexr•9mo ago
Not “all human knowledge” is digitised and published on the internet.
cess11•9mo ago
That's not going to happen. What is going to happen, is that humans are going to become more "AI agent"-like.
ggandv•9mo ago
The base rate is that "soon" never comes.
And soon flying cars, but now Facebook glasses.
bayindirh•9mo ago
How many of them per hour?
> And you can later tell other humans about what you have learned.
For how long can you retain this information without corruption and without evicting old information? How fast can you tell it, and in how many speech streams? To how many people?
This "but we modeled them after human brains, they are just like humans" argument is tiring. AI is as much human as an Airbus A380 is a bird.
bayindirh•9mo ago
You can cut down trees 100x more efficiently, but it proved to be disastrous for the planet, so we enacted more laws and try to control forestation/deforestation by regulation.
If AI can ingest things 100x faster, and it damages some things we have built, we have to regulate the process so those things don't get damaged, or so the producers are compensated fairly enough to keep their livelihoods, instead of treating writers and content producers like unwanted and unvalued bugs of the last century.
...and if things have become blurry because of new tech that was unfathomable a century ago, the solution is to bring this tech under the regulation so producers are protected again, not to yell "we're doing something awesome and we need no permission" while trashing the place and harming the people who built this corpus. No?
However, this is not that profitable, so AI people are pushing back.
ethbr1•9mo ago
1. Legally, what is the relationship between copying, compression, and learning? (i.e. what level of lossy compression removes copying obligations)
2. As policy, do we want to establish new rate limits for free use? (since the previous human-physical ones are no longer barriers)
philistine•9mo ago
Facebook torrented a book list.
tsimionescu•9mo ago
What does this have to do with LLM training? Does OpenAI have a data center in every library, and only process data from that library in that data center?
You're not allowed to maintain a personal copy of a book you borrowed from a library, even though you pay a library fee. Neither should OpenAI, especially since they didn't even pay that small fee.
dooglius•9mo ago
You are allowed to do this https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,...
AlotOfReading•9mo ago
If you were doing something else (e.g. acting as a public archive), you might not need a fair use defense because you'd fall under different rules.
tsimionescu•9mo ago
What OpenAI and the others are doing, by contrast, is the equivalent of stealing every book in a book store, making a digital copy, retaining that copy, and returning the original book. This is completely and obviously illegal, and has been tested in court many times - for example in https://en.m.wikipedia.org/wiki/The_Pirate_Bay_trial .
derbOac•9mo ago
Tell that to Aaron Swartz.
Ignoring that, it's not the reading that's the problem — if all AI was doing was reading, no one would be talking about it.
SCdF•9mo ago
They don't feel; what is this fantasy?
cbg0•9mo ago
Without text written by humans to construct its knowledgebase, an LLM would not be able to conjure up any sort of response or "feeling", as it isn't AI by any stretch of the imagination.
chongli•9mo ago
If some genius human were capable of ingesting every piece of art on the planet and replicating it from memory, then artists would sue that person for selling casually plagiarized works to all comers. When people get caught plagiarizing works in their university essays, they get severely punished.