CoreWeave to buy Core Scientific in $9B deal to meet AI power needs

https://www.reuters.com/legal/transactional/coreweave-acquire-crypto-miner-core-scientific-2025-07-07/
1•dangwu•2m ago•0 comments

Longevity Escape Velocity

https://en.wikipedia.org/wiki/Longevity_escape_velocity
1•ZeljkoS•3m ago•0 comments

Securing GitHub Copilot agent mode and MCP Workflows with runtime guardrails

https://www.tramlines.io/blog/tramlines-io-to-secure-mcp-usage-in-github-copilot-agent-mode
1•coderinsan•3m ago•0 comments

On-the-job learning upended by AI and hybrid work

https://www.ft.com/content/071089b8-839a-4f96-af79-394c08a146d1
1•thm•4m ago•0 comments

Mobile-Friendliness of Flagging Submissions

1•gipp•5m ago•0 comments

SUS Lang: The SUS Hardware Description Language

https://sus-lang.org/
4•nateb2022•7m ago•0 comments

Show HN: Unlearning Comparator, a visual tool to compare machine unlearning

https://gnueaj.github.io/Machine-Unlearning-Comparator/
1•jaeunglee•8m ago•0 comments

Serving 100s of LLMs on 1 GPU with LoRAX [video]

https://www.youtube.com/watch?v=i6zVvfvIFpc
1•codingmoh•9m ago•0 comments

A conversation on Claude Code [video]

https://www.youtube.com/watch?v=Yf_1w00qIKc
1•Bluestein•10m ago•0 comments

EN-ANALYSER – Open-source AI tool for network threat detection and analysis

https://github.com/M4nuel0/ENANALYSER
2•R31S•12m ago•1 comments

Show HN: What if unpaid invoices hurt a company's credit score? Now they do

https://credote.com
2•elenabrooks•12m ago•0 comments

Lithium: Elevating ETL with ephemeral and self-hosted pipelines (2024)

https://www.atlassian.com/blog/atlassian-engineering/lithium
1•mooreds•13m ago•0 comments

AI scores an own goal if you play up and play the game

https://www.theregister.com/2025/07/07/ai_scores_a_huge_own/
1•rntn•13m ago•0 comments

Measles 'out of control,' experts warn, as Alberta case counts surpass 1k

https://www.cbc.ca/news/canada/calgary/alberta-measles-cases-pass-1000-1.7567488
3•bookofjoe•14m ago•0 comments

John Carmack's talk on building AI [video]

https://www.youtube.com/watch?v=4epAfU1FCuQ
2•simonpure•14m ago•1 comments

AI Prompts for Content Creators: Blogs, Videos and Social Media

https://medium.com/@tim_62250/100-ai-prompts-for-content-creators-blogs-videos-social-media-f10f0de00124
1•businessmate•14m ago•0 comments

Google DeepMind has grand ambitions to 'cure all diseases' with AI

https://fortune.com/2025/07/06/deepmind-isomorphic-labs-cure-all-diseases-ai-now-first-human-trials/
2•Brajeshwar•15m ago•0 comments

Archaeologists unveil 3,500-year-old city in Peru

https://www.bbc.com/news/articles/c07dmx38kyeo
1•Brajeshwar•15m ago•0 comments

Homo Crustaceous

https://aeon.co/essays/are-humans-destined-to-evolve-into-crabs
1•Brajeshwar•15m ago•0 comments

NYC Audiences Will See 'Twin Peaks' Season 3 the Way Lynch Intended

https://www.indiewire.com/features/craft/twin-peaks-season-3-theatrical-mix-david-lynch-intended-1235136544/
2•evo_9•15m ago•0 comments

Show HN: Simple Wikiclaudia, a browser extension to simplify Wikipedia articles

https://mattsayar.com/simple-wikiclaudia/
1•MattSayar•16m ago•0 comments

Ball Lightning captured on camera in Alberta

https://www.youtube.com/watch?v=mmOfwFHBu_o
1•polishdude20•17m ago•0 comments

AI Native Software Factories

https://www.heavybit.com/library/podcasts/open-source-ready/ep-17-ai-native-software-factories-with-solomon-hykes
2•gk1•17m ago•0 comments

Inverse Triangle Inequality

https://matklad.github.io/2025/07/07/inverse-triangle-inequality.html
1•mfrw•17m ago•0 comments

Show HN: Interactive pinout for the Raspberry Pi Pico 2

https://pico2.pinout.xyz
2•gadgetoid•22m ago•0 comments

Corporate Action Tracker Browser Extension; 0 User Tracking or Data Collection

https://chromewebstore.google.com/detail/corporate-action-tracker/onanoffjjooamaoiopaailimimpggfaj
1•DocFeind•23m ago•0 comments

The Harvey Edwards Archive

https://www.harveyedwards-archive.com
3•toomuchtodo•23m ago•0 comments

claude-task-master – Task management for AI driven development

https://github.com/eyaltoledano/claude-task-master
1•jhund•24m ago•0 comments

Show HN: Upticker – Financial Intelligence Platform

https://upticker.ai/home
2•ahmedhawas123•24m ago•2 comments

Perspectives on creative well-being of older adults

https://www.sciencedirect.com/science/article/pii/S0890406523000609
1•thunderbong•26m ago•0 comments

Anthropic cut up millions of used books, and downloaded 7M pirated ones – judge

https://www.businessinsider.com/anthropic-cut-pirated-millions-used-books-train-claude-copyright-2025-6
171•pyman•7h ago

Comments

pyman•7h ago
Anthropic's cofounder, Ben Mann, downloaded millions of copies of books from Library Genesis in 2021, fully aware that the material was pirated.

Stealing is stealing. Let's stop with the double standards.

damnesian•6h ago
oh well, the product has a cute name and will make someone a billionaire, let's just give it the green light. who cares about copyright in the age of AI?
originalvichy•6h ago
At least most pirates just consume for personal use. Profiting from piracy is a whole other level beyond just pirating a book.
pyman•6h ago
Someone on Twitter said: "Oh well, P2P mp3 downloads, although illegal, made contributions to the music industry"

That's not what's happening here. People weren't downloading music illegally and reselling it on Claude.ai. And while P2P networks led to some great tech, there's no solid proof they actually improved the music industry.

drcursor•5h ago
Let's not forget Spotify ;)

https://gizmodo.com/early-spotify-was-built-on-pirated-mp3-f...

pyman•4h ago
Those claims were never proved.
Imustaskforhelp•4h ago
I really feel as if Youtube is the best sort of convenience for music videos where most people watch ads whereas some people can use an ad blocker.

I use an adblocker and tbh I think so many people on HN are okay with ad blocking and not piracy when basically both just block the end user from earning money.

I kind of believe that if you really like a piece of software, or really like anything, just ask the creators what their favourite charity is and donate there, or join their Patreon or find another direct way to support them.

Workaccount2•1h ago
If you are someone who can think clearly, it's extremely obvious that the conversation around copyright, LLMs, piracy, and ad-blocking is

"What serves me personally the best for any given situation" for 95% of people.

mnky9800n•5h ago
I feel like profit was always a central motive of pirates. At least from the historical documents known as, "The Pirates of the Caribbean".
KoolKat23•5h ago
This isn't really profiting from piracy. They don't make money off the raw input data. It's no different to consuming for personal use.

They make money off the model weights, which is fair use (as confirmed by recent case law).

j_w•4h ago
This is absurd. Remove all of the content from the training data that was pirated and what is the quality of the end product now?
pyman•4h ago
With Claude, people are paying Anthropic to access answers that are generated from pirated books, without the authors' permission, credit, or compensation.
KoolKat23•4h ago
There is no copyright on knowledge.

If it outputs parts of the book verbatim then that's a different story.

pyman•4h ago
Let's not change the focus of the debate.

Pirating 7 million books, remixing their content, and using that to power Claude.ai is like counterfeiting 7 million branded products and selling them on your personal website. The original creators don't get credit or payment, and someone’s profiting off their work.

All this happens while authors, many of them teachers, are left scratching their heads with four kids to feed.

KoolKat23•3h ago
That may be the case, but you'd have to have laws changed.
KoolKat23•4h ago
That's the law.

Please keep in mind, copyright is intended as a compromise between benefit to society and to the individual.

A thought experiment: what about students pirating textbooks and applying that knowledge later on in their work?

j_w•4h ago
When you say that's the law, as far as I'm aware a single ruling by a lower court has been issued which upholds that application. Hardly settled case law.
KoolKat23•3h ago
True, until then best to act as if it is the case.

In my opinion, it will be upheld.

Looking at what is stored and the manner in which it is stored, it makes sense that it's fair use.

mrcwinn•1h ago
> At least most pirates just consume for personal use.

Easy for the pirate to say. Artists might argue their intent was to trade compensation for one's personal enjoyment of the work.

Workaccount2•1h ago
The gut punch of being a photographer selling your work on display, someone walks by and lines up their phone to take a perfect picture of your photograph, and then exclaims to you "Your work is beautiful! I can't wait to print this out and put it on my wall!"
jobs_throwaway•1h ago
All the evidence shows that piracy is good for artists' business. You make a good work, people are exposed to it through piracy, and they end up buying more of your stuff than they would otherwise. But keep crying about the artist's plight
SketchySeaBeast•39m ago
The way you've presented this, the evidence is just "common sense", which isn't much evidence at all.
x3n0ph3n3•5h ago
Copyright infringement is not stealing.
1oooqooq•5h ago
actually, the only time it's an (ethical) crime is when a corporation does it at scale for profit.
pyman•5h ago
Pirating a book and selling it on claude.ai is stealing, both legally and morally.
zb3•5h ago
Who got robbed? Just because I'd pay for AI it doesn't mean I'd buy these books.
pyman•4h ago
You should ask the teachers who spent years writing those books.
azangru•43m ago
You keep saying the word "teachers"; but that word does not appear in the text of the article. Why focus on the teachers in particular?

Also, there are various incentives for teachers to publish books. Money is just one of them (I wonder how much revenue books bring to the teachers). Prestige and academic recognition is another. There are probably others still. How realistic is the depiction of a deprived teacher whose livelihood depended on the books he published once every several years?

BlackFly•4h ago
Making a copy differs from taking an existing object in all aspects: literally, technically, legally and ethically. Piracy is making a copy you have no legal right to. Stealing is taking a physical object that you have no legal right to. While the "no legal right to" seems the same superficially, in practice the laws differ quite a bit because the literal, technical and ethical aspects differ.
TiredOfLife•4h ago
They are not selling it on claude.ai. If you can prove that they are you will be rich.
seydor•2h ago
property infringement isn't either?
eviks•20m ago
If you infringe by destroying property, then yes, it's not stealing
impossiblefork•2h ago
It's very similar to theft of service.

There's so many texts, and they're so sparse that if I could copyright a work and never publish it, the restriction would be irrelevant. The probability that you would accidentally come upon something close enough that copyright was relevant is almost infinitesimal.

Because of this copyright is an incredibly weak restriction, and that it is as weak as it is shows clearly that any use of a copyrighted work is due to the convenience that it is available.

That is, it's about making use of the work somebody else has done, not about that restricting you somehow.

Therefore copyright is much more legitimate than ordinary property. Ordinary property, especially ownership of land, can actually limit other people. But since copyright is so sparse, infringing on it is like going to a world with near-infinite space, picking the precise place where somebody has planted a field, and deciding to harvest from that particular field.

Consequently I think copyright infringement might actually be worse than stealing.

jpalawaga•5m ago
you've created a very obvious category mistake in your final summary by confusing intellectual property--which can be copied at no penalty to an owner (except nebulous 'alternate universe' theories)--with actual property, and a farmer and his land, with a crop that cannot be enjoyed twice.

you're saying copying a book is worse than robbing a farmer of his food and/or livelihood, which cannot be replaced or duplicated. Meanwhile, someone who copies a book does not deprive the author of selling the book again (or of the tasty proceeds from a harvest).

I can't say I agree, for obvious reasons.

Der_Einzige•1h ago
Information wants to be free.
troyvit•11m ago
Then why does Claude cost money?
dathinab•1h ago
stealing with the intent to gain an unfair market advantage, so that you can effectively kill any company that acts ethically and legally, in a way which is very likely going to hurt many authors through the products you create, is far worse than just stealing for personal use

that isn't "just" stealing, it's organized crime

1970-01-01•35m ago
Let's get actual definitions of 'theft' before we leap into double standards.
neo__•6h ago
Hopefully they were all good books at least.
pyman•6h ago
they pirated the best ones, according to the authors
pyman•6h ago
These are the people shaping the future of AI? What happened to all the ethical values they love to preach about?

We've held China accountable for counterfeiting products for decades and regulated their exports. So why should Anthropic be allowed to export their products and services after engaging in the same illegal activity?

lofaszvanitt•4h ago
This is the underlying caste system coming to life right before your eyes :D.
stephenitis•4h ago
I think caste system is the wrong analogy here.

Comment is more about the pseudo ethical high ground

MangoToupe•1h ago
Companies being above the law does create a stratified system in this country for those who can benefit from said companies and those who cannot. Call it what you like.
seydor•2h ago
break things and move fast
benjiro•2h ago
One rule for you, one rule for me ...

You never noticed the hypocritical behavior all over society?

* Oh, you drunk drive: big fine, lots of trouble.
* Oh, you drunk drive and are a senator, cop, mayor, ... Well, let's look the other way.

* You have anger management issues and slam somebody to the ground. Jail time.
* You as a cop have anger management issues and slam somebody to the ground. Well, paid time off while we investigate and maybe a reprimand. Qualified immunity boy!

* You commit tax fraud for 10k: felony record, maybe jail time.
* You as an exec of a company commit tax fraud for 100 million. After 10 years lawyering around, maybe you get something, maybe, ... oh, here is a fine of 5 million.

I am sorry, but the idea of everybody being equal under the law has always been an illusion.

We are holding China accountable for counterfeiting products because it hurts OUR companies, and their income. But when it's "us vs us", well, then it becomes a bit more messy and in general, those with the biggest backing (as in $$$, economic value, and lawyers) tend to win.

Wait, if somebody steals my book, I can sue that person in court and get a payout (lawyers will cost me more, but that is not the point). If some AI company steals my book, well, the chance you win is close to 1%, simply because lots of well-paid lawyers will make winning hard to impossible.

Our society has always been based upon power, wealth and influence. The more you have of it, the more you get away with (or get reduced penalties for) things that get others fined or jailed.

ffsm8•1h ago
> We've held China accountable for counterfeiting products for decades and regulated their exports

We have? Are we from different multi-verses?

The one I've lived in to date has not done anything against Chinese counterfeits beyond occasionally seizing counterfeit goods during import. But that's merely occasionally enforcing local counterfeit law, a far cry from punishing the entity producing it.

As a matter of fact, the companies started outsourcing everything to China, making further IP theft and quasi-copies even easier

Workaccount2•1h ago
I was gonna say, the enforcement is so weak that it's not even really worth it to pursue consumer hardware here in the US. Make a product that is a hit, patent it, and still 1 month later IYTUOP will be selling an identical copy for 1/3rd the price on Amazon.
delfinom•1h ago
Patent enforcement requires the patent holder to go after violators. The sad thing is, there are grounds to sue Amazon for facilitating it, just nobody has had the money to do it. And no big company ever will because of the threat of being locked out of AWS.

It's quite the mafia operation over at Amazon.

DrillShopper•1h ago
> So why should Anthropic be allowed to export their products and services after engaging in the same illegal activity?

Rules don't apply to corporations making money for VCs.

So it goes.

wmf•40m ago
The unethical ones didn't buy any books.
bmitc•35m ago
Silicon Valley has always been the antithesis of ethics. Its foundations are much more right wing and libertarian, along the extremist lines.
carlosjobim•33m ago
Why is it unethical of them to use the information in all these books? They are clearly not reselling the books in any way, shape, or form. The information itself in a book can never be copyrighted. You can also publish and sell material where you quote other books within it.
ramon156•5h ago
Pirating and paying the fine is probably a hell of a lot cheaper than individually buying all these books. I'm not saying this is justified, but what would you have done in their situation?

Saying "they have the money" is not an argument. It's about the amount of effort that is needed to individually buy, scan, and process millions of pages. If that's done for you, why redo it all?

TimorousBestie•5h ago
150K per work is the maximum fine for willful infringement (which this is).

Across 7 million works that comes to $1.05T+, which is more than Anthropic is worth on paper.

Of course they’re not going to be charged to the fullest extent of the law, they’re not a teenager running Napster in the early 2000s.

voxic11•2h ago
Even if they don't qualify for willful infringement damages (let's say they have a good faith belief their infringement was covered by fair use), the standard statutory damages for copyright infringement are $750-$30,000 per work.
pyman•5h ago
The problem with this thinking is that hundreds of thousands of teachers who spent years writing great, useful books and sharing knowledge and wisdom probably won't sue a billion dollar company for stealing their work. What they'll likely do is stop writing altogether.

I'm against Anthropic stealing teachers' work and discouraging them from ever writing again. Some teachers are already saying this (though probably not in California).

lofaszvanitt•4h ago
They won't be needed anymore, once singularity is reached. This might be their thought process. This also exemplifies that the loathed caste system found in India is indeed in place in western societies.

There is no equality, and seemingly there are worker bees who can be exploited, and there are privileged ones, and of course there are the queens.

pyman•4h ago
:D

Note: My definition of singularity isn't the one they use in San Francisco. It's the moment founders who stole the life's work of thousands of teachers finally go to prison, and their datacentres get seized.

lofaszvanitt•4h ago
You can bet that this is never gonna happen...
covercash•1h ago
When the rich and powerful face zero consequences for breaking laws and ignoring the social contracts that keep our society functioning, you wind up with extreme overcorrections. See Luigi.
achierius•45m ago
How extreme is that, really? Not to justify murder: that is clearly bad. But "killing one man" is evidently something we, as a society, consider an "acceptable side-effect" when a corporation does it -- hell, you can kill thousands and get away scot-free if you're big enough.

Luigi was peanuts in comparison.

“THERE were two “Reigns of Terror,” if we would but remember it and consider it; the one wrought murder in hot passion, the other in heartless cold blood; the one lasted mere months, the other had lasted a thousand years; the one inflicted death upon ten thousand persons, the other upon a hundred millions; but our shudders are all for the “horrors” of the minor Terror, the momentary Terror, so to speak; whereas, what is the horror of swift death by the axe, compared with lifelong death from hunger, cold, insult, cruelty, and heart-break? What is swift death by lightning compared with death by slow fire at the stake? A city cemetery could contain the coffins filled by that brief Terror which we have all been so diligently taught to shiver at and mourn over; but all France could hardly contain the coffins filled by that older and real Terror—that unspeakably bitter and awful Terror which none of us has been taught to see in its vastness or pity as it deserves.”

- Mark Twain

SketchySeaBeast•2h ago
> They won't be needed anymore, once singularity is reached.

And it just so happens that that belief says they can burn whatever they want down because something in the future might happen that absolves them of those crimes.

CuriouslyC•4h ago
If you care so little about writing that AI puts you off it, TBH you're probably not a great writer anyhow.

Writers that have an authentic human voice and help people think about things in a new way will be fine for a while yet.

4b11b4•3h ago
Yeah, people will still want to write. They might need new ways to monetize it... that being said, even if people still want to write they may not consider it a viable path. Again, have to consider other monetization.
glimshe•4h ago
That will be sad, although there will still be plenty of great people who will write books anyway.

When it comes to a lot of these teachers, I'll say, copyright works hand in hand with college and school course book mandates. I've seen plenty of teachers making crazy money off students' backs due to these mandates.

A lot of the content taught in undergrad and school hasn't changed in decades or even centuries. I think we have all the books we'll ever need in certain subjects already, but copyright keeps enriching people who write new versions of these.

glimshe•5h ago
Isn't "pirating" a felony with jail time, though? That's what I remember from the FBI warning I had to see at the beginning of every DVD I bought (but not "pirated" ones).
pyman•5h ago
Absolutely.

Pirating 7 million books, remixing their content, and using that to make money on Claude.ai is like counterfeiting 7 million branded products and selling them on your Shopify website. The original creators don't get payment, and someone's profiting off their work. Try doing that yourself and you'd get a knock on the door real quick.

mystified5016•2h ago
No, it isn't.
dmix•2h ago
A court just ruled on Anthropic and said an LLM response wasn't a form of counterfeiting (ie, essentially selling pirate books on the black market). Although tbf that is the most radical interpretation still being put forward by the lawyers of publishers like NYTimes, despite the obvious flaws.
voxic11•2h ago
Yes criminal copyright infringement (willful copyright infringement done for commercial gain or at a large scale) is a felony.
kevingadd•5h ago
Google did it the legal way with Google Books, didn't they?
pyman•4h ago
No, Google did not sell the books through Google Books. Anthropic is selling the transformed version of the books on claude.ai.

Pirating 7 million books, remixing their content, and using that to make money on Claude.ai is like counterfeiting 7 million branded products and selling them on your Shopify website. The original creators don't get payment, and someone's profiting off their work.

suyjuris•4h ago
The judge appears to disagree with you on this. They found that training and selling an LLM are fair use, based on the fact that it is exceedingly transformative, and that the copyright holders are not entitled to any profits thereof due to copyright. (They also did get paid: Anthropic acquired millions of books legally, including those of all the authors in this complaint. This would not retroactively absolve them of legal fault for past infringements, of course.)
pyman•4h ago
The trial is scheduled for December 2025. That's when a jury will decide how much Anthropic owes for copying and storing over seven million pirated books.
suyjuris•3h ago
Yes, that would be an interesting trial. But it is only about six books, and all claims regarding Claude have been dismissed already. So only the internal copies remain, and there the theory for them being infringing is somewhat convoluted: you have to argue that they are not just for purposes of training (which was ruled fair use), and award damages even though these other purposes never materialised (since by now, they have legal copies of those books). I can see it, but I would not count on there being a trial.
flaptrap•2h ago
The fallacy in the 'fair use' logic is that a person acquires a book and learns from it, but a machine incorporates the text. Copyright does not allow one to create a derivative work without permission. Only when the result of the transformation resembles the original work could it be said that it is subject to copyright. Do not regard either of those legal issues as set in concrete yet.
mensetmanusman•1h ago
Both a human and a machine learn from it. You can design an LLM that doesn’t spit back the entire text after annealing. It just learns the essence like a human.
badmintonbaseba•1h ago
Morally maybe, but AFAIK machines "learning" and creating creative works on their own is not recognized legally, at least certainly not the same way as for people.
Workaccount2•1h ago
>AFAIK machines "learning" and creating creative works on their own is not recognized legally

Did you read the article? The judge literally just legally recognized it.

maeln•4h ago
If you wanted to be legit with 0 chance of going to court, you would contact publishers, ask to pay for a license to access their catalog for training, and negotiate from that point.

This is what every company using media is doing (think Spotify, Netflix, but also journals, ad agencies, ...). I don't know why people on HN are giving AI companies a pass for this kind of behavior.

pyman•4h ago
100%

It's the new narrative in certain circles, especially in San Francisco: it's us vs China. We're doing all this to beat them, no matter the cost. While teachers are left scratching their heads with four kids to feed.

edgineer•1h ago
The paradigm is that teachers will teach life skills like public speaking and entrepreneurship. Book smarts that can be more effectively taught by AI will be, once schools catch up.
ohashi•1h ago
Because they are mostly software developers who think it's different because it impacts them.
suyjuris•4h ago
Just downloading them is of course cheaper, but it is worth pointing out that, as the article states, they did also buy legitimate copies of millions of books. (This includes all the books involved in the lawsuit.) Based on the judgement itself, Anthropic appears to train only on the books legitimately acquired. Used books are quite cheap, after all, and can be bought in bulk.
asadotzler•54m ago
Buying a book is not a license to resell that content for your own profit. I can't buy a copy of your book, make a million Xeroxes of it and sell those. The license you get when you buy a book is for a single use, not a license to do whatever you want with the contents of that book.
darkoob12•3h ago
This is not about paying for a single copy. It would still be wrong even if they have bought every single one of those books. It is a form of plagiarism. The model will use someone else's idea without proper attribution.
jeroenhd•1h ago
Legally speaking, we don't know that yet. Early signs are pointing at judges allowing this kind of crap because it's almost impossible for most authors to point out what part of the generated slop was originally theirs.
tmaly•2h ago
At minimum they should have to buy the book they are deriving weights from.
bmitc•34m ago
> I'm not saying this is justified, but what would you have done in their situation?

Individuals would have their lives ruined either from massive fines or jail time.

tliltocatl•5h ago
If the AI movement manages to undermine Imaginary Property, it would redeem its externalities threefold.
57473m3n7Fur7h3•5h ago
I don’t think that’s gonna happen. I think they will manage to get themselves out of trouble for it, while the rest of us will still face serious problems if we are caught torrenting even one singular little book.
tliltocatl•5h ago
Even so, would be hard to prove that this particular little book wasn't generated by Claude (oopsie, it happens to be a verbatim copy of a copyrighted work, that happens sometimes, those pesky LLMs).
pyman•4h ago
You just need to audit their system. Shouldn't take more than a couple of hours.
2OEH8eoCRo0•2h ago
The Ocean Full of Bowling Balls
ttoinou•5h ago
It would be great, but I think some are worried that new AI BigTech will find a way to continue enforcing IP on the rest of society while it won't exist for them
Imustaskforhelp•5h ago
I think that we are worried because I think that's exactly what's going to happen/ is happening.
karel-3d•5h ago
That would render GPL and friends redundant too... copyleft depends on copyright.
bayindirh•4h ago
What are your feelings about how the small fish are stripped of their art, and their years of work become just a prompt? Mainly comic artists and small musicians who are doing things they like and putting them out for people, but not for much money?
tliltocatl•4h ago
"But think about the children". The copyright system is doing too much damage to culture and society. Yes, it does provide a pond for some small fish, but the overall damage outweighs this. Likewise, the fact that the first estate provided sustenance for arts and crafts to flourish doesn't make the ancien régime any less screwed up.
bayindirh•2h ago
I think I worded my question wrong. I was asking not about how AI affects the financials of these smaller artists, but about their wellbeing in general.

There are many small artists who do this not for money, but for fun, and have their own renowned styles. Even their styles are ripped off by these generative AI companies and turned into a slot machine to earn money for themselves. These artists didn't consent to that, and this affects their (mental) well-being.

With that context in mind, what do you think about these people, who are not in this for money, being stripped of their years of achievement and having their hard work exploited for money by generative AI companies?

It's not about IP (with whatever expansion you prefer) or laws, but ethics in general.

Substitute comics for any medium. Code, music, painting, illustration, literature, short movies, etc.

CamperBob2•2h ago
(Shrug) If you want things to stay the same, both art and technology are bad career choices.
bayindirh•2h ago
(Huh) What if you are in the field to advance it, and somebody steals your work and claims it as their own?

e.g.: https://news.ycombinator.com/item?id=44460552

tliltocatl•2h ago
I see your point, "AI art" sucks in general and this is ethically sketchy as hell, but AFAIK style copying has never been covered by copyright in the first place. Yeah, it sucks to be alienated from your works. That's one of the externalities I mentioned in the original comment. But there is simply no remedy there. That's how the reality is.
bayindirh•1h ago
Thanks for your answer, and for taking the time to write it!

Yes, style copying is generally considered legal, but as another commenter posted in a related thread "scale matters".

Maybe this will be reconsidered in the near future, as the scale is on a very different level with Generative AI. While there can be no technological solution to this (since it's a social problem to begin with), maybe public opinion about this issue will evolve over time.

To be crystal clear: I'm not against the tech. I'm against abusing and exploiting people for solely monetary profit.

pxc•2h ago
It's true that intellectual property is a flawed and harmful mechanism for supporting creative work, and it needs to change, but I don't think ensuring a positive outcome is as simple as this. Whether or not such a power struggle between corporate interests benefits the public rather than just some companies will be largely accidental.

I do support intellectual property reform that would be considered radical by some, as I imagine you do. But my highest hopes for this situation are more modest: if AI companies are told that their data must be in the public domain to train against, we will finally have a powerful faction among capitalists with a strong incentive to push back against the copyright monopolists when it comes to the continuous renewal of copyright terms.

If the "path of least resistance" for companies like Google, Microsoft, and Meta becomes enlarging the public domain, we might finally begin to address the stagnation of the public domain, and that could be a good thing.

But I think even such a modest hope as that one is unlikely to be realized. :-\

Der_Einzige•2h ago
Yup.

My response to this whole thread is just “good”

Aaron Swartz is a saint and a martyr.

LtWorf•1h ago
It will undermine it only for the rich owners of AI companies, not for everyone.
Lionga•5h ago
Based on the fact people went to jail for downloading some music or movies, this guy will face a lifetime in prison for 7 million books that he then used for commercial profit right?

Right guys we don't have rules for thee but not for me in the land of the free?

1oooqooq•5h ago
Aaron Swartz rolling
pyman•5h ago
He downloaded millions of academic articles and the government charged him with multiple felonies.

The difference is, Aaron Swartz wasn't planning to build massive datacenters with expensive Nvidia servers all over the world.

mikewarot•5h ago
>the government charged him with multiple felonies.

This was the result of a cruel and zealous overreach by the prosecutor to try to advance her political career. It should never have gone that far.

The failure of MIT to rally in support of Aaron will never be forgiven.

pyman•4h ago
I agree
omnimus•5h ago
It's even worse considering all he downloaded was in the public domain, so it was much less problematic as far as copyright goes.

Lesson is simple. If you want to break a law make sure it is very profitable because then you can find investors and get away with it. If you play robin hood you will be met with a hammer.

dandanua•5h ago
Meta did the same, and probably other big companies too. People who praise AGI are very short-sighted. It will ruin the world with our current morals and ethics. It's like a nuclear weapon in the hands of barbarians (shit, we have that too, actually).
booleandilemma•5h ago
So if I'm working on an LLM can I just steal millions of copyrighted books? Is that how our farcical justice system works?
famahar•1h ago
Make sure you have a few billion dollars ready so you can pay a few million on the lawsuits. A volcano getting a cup of water poured into it.
marapuru•5h ago
Apparently it's a common business practice. Spotify (even though I can't find any proof) seems to have built their software and business on pirated music. There is some more in this article [0].

https://torrentfreak.com/spotifys-beta-used-pirate-mp3-files...

Funky quote:

> Rumors that early versions of Spotify used ‘pirate’ MP3s have been floating around the Internet for years. People who had access to the service in the beginning later reported downloading tracks that contained ‘Scene’ labeling, tags, and formats, which are the tell-tale signs that content hadn’t been obtained officially.

motbus3•5h ago
They had a second company (whose name I don't remember) that allowed users to back up and share their music. When they were exposed, they buried that as deep as they could.
pyman•5h ago
No. There's no credible evidence Spotify had any secret second company that allowed users to back up and share music without authorisation.
pyman•5h ago
It was the opposite. Their mission was to combat music piracy by offering a better, legal alternative.

Daniel Ek said: "my mission is to make music accessible and legal to everyone, while ensuring artists and rights holders got paid"

Also, the Swedish government has zero tolerance for piracy.

pyman•4h ago
I know this might come as a shock to those living in San Francisco, but things are different in other parts of the world, like Uruguay, Sweden and the rest of Europe. From what I’ve read, the European committee actually cares about enforcing the law.
eviks•33m ago
Mission is just words, they can mean the opposite of deeds, but they can't be the opposite, they live in different realms.
KoolKat23•5h ago
There's plenty of startups gone legitimate.

Society underestimates the chasm that exists between an idea and raising sufficient capital to act on those ideas.

Plenty of people have ideas.

We only really see those that successfully cross it.

Small things: EULA breaches or consumer licenses being used commercially, for example.

pyman•4h ago
There's no credible evidence Spotify built their company and business on pirated music.

This is a narrative that gets passed around in certain circles to justify stealing content.

YPPH•4h ago
"Stealing" isn't an apt term here. Stealing a thing permanently deprives the owner of the thing. What you're describing is copyright infringement, not stealing.

In this context, stealing is often used as a pejorative term to make piracy sound worse than it is. Except for mass distribution, piracy is often regarded as a civil wrong, and not a crime.

pyman•4h ago
Pirating a book and selling it on claude.ai is stealing, both legally and morally.

Pirating 7 million books, remixing their content, and using that to make money on Claude.ai is like counterfeiting 7 million branded products and selling them on your Shopify website. The original creators don't get payment, and someone's profiting off their work.

Try doing that yourself and you'd get a knock on the door real quick.

KoolKat23•4h ago
Properly remixing the content so that it can be considered distinct would be fair use. You can't copyright a style, concept or idea.

Also mostly this would be a civil lawsuit for "damages".

pyman•4h ago
It might be legal in the US, but not in the rest of the world.

The trial is scheduled for December 2025. That’s when a jury will decide how much Anthropic owes for copying and storing over seven million pirated books.

ungreased0675•1h ago
There seems to be an unwritten rule for VC-backed tech companies, that if a law is broken at massive scale and very quickly, it’s ok. It’s the fait accompli strategy many of the large tech companies used to get where they are.

Don’t have legal access to training data? Simply steal it, but move fast enough to keep ahead of the law. By the time lawsuits hit the company is worth billions and the product is embedded in everyday life.

KoolKat23•4h ago
Best/most succinct explanation I've seen to date.
lmm•2h ago
> There's no credible evidence Spotify built their company and business on pirated music.

That's a statement carefully crafted to be impossible to disprove. Of course they shipped pirated music (I've seen the files). Of course anyone paying attention knew. Nothing in the music industry was "clean" in those days. But, sure, no credible evidence because any evidence anyone shows you you'll decide is not credible. It's not in anyone's interests to say anything and none of it matters.

hinterlands•1h ago
The problem is that these "small things" are not necessarily small if you're an individual.

If you're an individual pirating software or media, then from the rights owners' perspective, the most rational thing to do is to make an example of you. It doesn't happen everyday, but it does happen and it can destroy lives.

If you're a corporation doing the same, the calculation is different. If you're small but growing, future revenues are worth more than the money that can be extracted out of you right now, so you might get a legal nastygram with an offer of a reasonable payment to bring you into compliance. And if you're already big enough to be scary, litigation might be just too expensive to the other side even if you answer the letter with "lol, get lost".

Even in the worst case - if Anthropic loses and the company is fined or even shuttered (unlikely) - the people who participated in it are not going to be personally liable and they've in all likelihood already profited immensely.

dathinab•1h ago
but it's not some small things

but systematic, widespread big things, and often many of them, giving US giants an unfair competitive advantage

and don't think that if you are an EU company you can do the same in the US, nope nope

but naturally the US insists that US companies can do that in the EU and complains every time a US company is fined for not complying with EU law

Barrin92•14m ago
>Society underestimates the chasm that exists between an idea and raising sufficient capital to act on those ideas.

The AI sector, famously known for its inability to raise funding. Anthropic has in the last four years raised 17 billion dollars

pjc50•4h ago
"recording obtained unofficially" and "doesn't have rights to the recording" are separate things. So they could well have got a license to stream a publisher's music but that didn't come with an actual copy of some/all of the music.
techjamie•3h ago
Crunchyroll was originally an anime piracy site that went legit and started actually licensing content later. They started in mid-2006, got VC funding in 2008, then made their first licensing deal in 2009.

https://www.forbes.com/2009/08/04/online-anime-video-technol...

https://venturebeat.com/business/crunchyroll-for-pirated-ani...

haiku2077•1h ago
Good Old Games started out with the founders selling pirated games on disc at local markets.
Cyph0n•1h ago
Yep, they were huge too - virtually anyone who watched free anime back then would have known about them.

My theory is that once they saw how much traffic they were getting, they realized how big of a market (subbed/dubbed) anime was.

Shank•28m ago
And now Crunchyroll is owned by (through a lot of companies, like Aniplex of America, Aniplex, A1 Pictures) Sony, who produces a large amount of anime!
dathinab•1h ago
not just Spotify, pretty much any (most?) current tech giant was built by

- riding a wave of change

- not caring too much about legal constraints (or, as they would say now, "disrupting" the market, which very, very often means doing illegal shit that brings them far more money than any penalties they will ever face for it)

- or caring about ethics too much

- and in recent years (starting with Amazon) a lot of technically illegal financing (undercutting competitors' prices long term using money from elsewhere (e.g. investors) is an unfair competitive advantage that is, theoretically, clearly not allowed by antimonopoly laws). And before that you often still had other monopoly issues (e.g. see Wintel)

So yes, systematically not complying with the law to get an unfair competitive advantage, knowing that many of the laws are, in the larger picture, toothless when applied to huge companies, is the bread and butter of US tech giants

benced•3m ago
As you point out, they mostly did this before they were large companies (where the public choice questions are less problematic). Seems like the breaking of these laws was good for everybody.
Workaccount2•1h ago
The common meme is that megacorps are shamelessly criminal organizations that get away with doing anything they can to maximize profits. While true in some regard, that totally pales in comparison to the illegal things small businesses and start-ups do.
reaperducer•27m ago
> Apparently it's a common business practice.

It's not a common business practice. That's why it's considered newsworthy.

People on the internet have forgotten that the news doesn't report everyday, normal, common things, or it would be nothing but a listing of people mowing their lawns or applying for business loans. The reason something is in the news is because it is unusual or remarkable.

"I saw it online, so it must happen all the time" is a dopey lack of logic that infects society.

lysace•9m ago
You are missing the point. Spotify had permission from the copyright holders and/or their national proxies to use those songs in a limited beta in Sweden. They didn't have access to clean audio data directly from the record companies, so in many cases they used pirated rips instead.

What you really should be asking is whether they infringed on the copyrights of the rippers. /s

pembrook•3m ago
It wasn’t just the content being pirated, but the early Spotify UI was actually a 1:1 copy of Limewire.
NoMoreNicksLeft•2m ago
This isn't as meaningful as it sounds. Nintendo was apparently using scene roms for one of the official emulators on Wii (I think?). Spotify might have received legally-obtained mp3s from the record companies that were originally pulled from Napster or whatever, because the people who work for record companies are lazy hypocrites.
motbus3•5h ago
It is shocking how courts have been ruling in favor of AI companies despite the obvious problem of allowing automatic plagiarism
jobs_throwaway•1h ago
Information wants to be free
kristofferR•34m ago
Not really, plagiarism is not a legal concept.
Kim_Bruning•4h ago
actual title:

"Anthropic cut up millions of used books to train Claude — and downloaded over 7 million pirated ones too, a judge said."

A not-so-subtle difference.

That said, in a sane world, they shouldn't have needed to cut up all those used books yet again when there's obviously already an existing file that does all the work.

greenavocado•3h ago
Should have listened to those NordVPN ads on YouTube
sidewndr46•2h ago
So using the standard industry metrics for calculating the financial impact of piracy, this would equate to something like trillions in damages to the book publishing industry?
2OEH8eoCRo0•2h ago
I've begun to wonder if this is why some large torrent sites haven't been taken down. They are essentially able to crowdsource all the work. There are some users who spend ungodly amounts of time and money on these sites that I suspect are rich industry benefactors.
neonate•1h ago
https://archive.md/YLyPg
bgwalter•1h ago
Here is how individuals are treated for massive copyright infringement:

https://investors.autodesk.com/news-releases/news-release-de...

piker•1h ago
I thought you'd go with this: https://en.wikipedia.org/wiki/United_States_v._Swartz
dialup_sounds•57m ago
Swartz wasn't charged with copyright infringement.
natch•31m ago
*technically
chourobin•1h ago
copyright is not the same as piracy
asadotzler•59m ago
piracy isn't a thing, except on the high seas. what you're thinking about is copyright violation.
downrightmike•53m ago
Yup, piracy sounds better than copyright violation.

“Piracy” is mostly a rhetorical term in the context of copyright. Legally, it’s still called infringement or unauthorized copying. But industries and lobbying groups (e.g., RIAA, MPAA) have favored “piracy” for its emotional weight.

collingreen•46m ago
Emotional weight or because it's intentionally misleading.
admissionsguy•6m ago
Does piracy have negative connotations? I thought everyone thought pirates were cool
achierius•54m ago
Can you explain why? What makes them categorically different or at the very least why is "piracy" quantitatively worse than 'just' copyright violation?
arrosenberg•43m ago
Piracy is theft - you have taken something and deprived the original owner of it.

Copyright infringement is unauthorized reproduction - you have made a copy of something, but you have not deprived the original owner of it. At most, you denied them revenue although generally less than the offended party claims, since not all instances of copying would have otherwise resulted in a sale.

charcircuit•37m ago
Saying that piracy isn't copyright violation is an RMS talking point. It's not worth trying to ask why because the answer will be RMS said so and will not be backed by the common usage of the word.
buzzerbetrayed•30m ago
You legitimately have it completely backwards. The word "piracy" was coopted to put a more severe spin on copyright violation. As a result, it became "the common usage of the word". But that was by design. And it's worth pushing back on.
abeppu•24m ago
Maybe the most memorable version of the response is the "Copying is not Theft" song. https://www.youtube.com/watch?v=IeTybKL1pM4
NoMoreNicksLeft•7m ago
Asked unironically: "What's worse, hijacking ships at sea and holding their crews hostage for ransom on threat of death, or downloading a song off the internet?" ...
nh23423fefe•40m ago
What point are you making? 20 years ago, someone sold pirated copies of software (where's the transformation here?) and that's the same as using books in a training set? The judge already said reading isn't infringement.

This is reaching at best.

JimDabell•39m ago
> illegally copying and selling pirated software

This is very different to what Anthropic did. Nobody was buying copies of books from Anthropic instead of the copyright holder.

farceSpherule•2m ago
Peterson was copying and selling pirated software.

Come up with a better comparison.

dathinab•1h ago
as far as I understand, while training on books is clearly not fair use (as the result will likely hurt the livelihood of authors, especially not "best of the best" authors).

as long as you buy the book it still should be legal, that is if you actually buy the book and not a "read only" eBook

but the 7_000_000 pirated books are a huge issue, and one which we have a lot of reason to believe isn't specific to Anthropic

asadotzler•50m ago
Buying a copy of a book does not give you license to take the exact content of that book, repackage it as a web service, and sell it to millions of others. That's called theft.
russell_h•1h ago
The title is clearly meant to generate outrage, but what is wrong with cutting up a book that you own?
timewizard•1h ago
It's destroying something with value for no sane reason. Wasteful and sociopathic.

EDIT: Ah, the old, Hacker News pretend good faith question. Why bother answering you people? You're not interested in anything other than your existing point of view. The least Hacker thing about this place.

jobs_throwaway•1h ago
poverty mindset. We can make more books, and now these copies contribute to a corpus of knowledge that far more people benefit from
timewizard•1h ago
People who pay Anthropic you mean. There is no benefit. And only the owner can make more books.

Fake altruistic mindset. Super sociopathic.

justinrubek•1h ago
Wasteful mindset. They don't need the books, they need the data. They should never have been printed if they were going to be destroyed.
nickpsecurity•1h ago
Buying, scanning, and discarding was in my proposal to train under copyright restrictions.

You are often allowed to make a digital copy of a physical work you bought. There are tons of used, physical works that would be good for training LLMs. They'd also be good for training OCR which could do many things, including improve book scanning for training.

This could be reduced to a single act of book destruction per copyrighted work or made unnecessary if copyright law allowed us to share others' works digitally with their licensed customers. Ex: people who own a physical copy or a license to one. Obviously, the implementation could get complex but we wouldn't have to destroy books very often.

asadotzler•50m ago
You are allowed to make a digital copy FOR YOUR OWN USE. You are not allowed to make a billion digital copies and sell those, that's called theft.
NHQ•1h ago
The farce of treating a corporation as an individual precludes common sense legal procedure to investigate people who are responsible for criminal action taken by the company. It's obviously premeditated and in all ways an illicit act knowingly perpetrated by persons. The only discourse should be about upending this penthouse legalism.
NHQ•1h ago
The irony is that actually litigating copyright law would lead to the repeal of said copyright law. And so in all cases of backwaters laws that are used to "protect interests" of "corporations" yet criminalize petty individual cases.

This of course cannot be allowed to happen, so the legal system is just a limbo, a bar which regular individuals must strain to pass under but that corporations regularly overstep.

outside1234•1h ago
So if you incorporate you can do whatever you want without criminal charges?
trinsic2•1h ago
I'm not seeing how this is fair use in either case.

Someone correct me if I am wrong but aren't these works being digitized and transformed in a way to make a profit off of the information that is included in these works?

It would be one thing for an individual to make personal use of one or more books, but you've got to have some special blindness not to see that a for-profit company's use of this information to improve a for-profit model is clearly going against what copyright stands for.

jimbob21•53m ago
They clearly were being digitized, but I think it's a more philosophical discussion, one that we're only banging our heads against for the first time, to say whether or not it is fair use.

Simply, if the models can think then it is no different than a person reading many books and building something new from their learnings. Digitization is just memory. If the models cannot think then it is meaningless digital regurgitation and plagiarism, not to mention breach of copyright.

The quotes "consistent with copyright's purpose in enabling creativity and fostering scientific progress." and "Like any reader aspiring to be a writer" say, from what I can tell, that the judge has legally ruled the model can think as a human does, and therefore has the legal protections afforded to "creatives."

palmotea•42m ago
> Simply, if the models can think then it is no different than a person reading many books and building something new from their learnings.

No, that's fallacious. Using anthropomorphic words to describe a machine does not give it the same kinds of rights and affordances we give real people.

jimbob21•33m ago
Actually, it does, at least for this case. The judge just said so.
wrs•50m ago
Copyright is not on “information”, it’s on the tangible expression (i.e., the actual words). “Transformative use” is a defense in copyright infringement.
kristofferR•47m ago
What do you think fair use is? The whole point of the fair use clauses is that if you transform copyrighted works enough you don't have to pay the original copyright holder.
skybrian•28m ago
Copyright is largely about distributing copies. It’s not about making something vaguely similar or about referencing copyrighted work to make something vaguely similar.

Although, there’s an exception for fictional characters:

https://en.m.wikipedia.org/wiki/Copyright_protection_for_fic...

ruffrey•54m ago
Two of the top AI companies flouted ethics with regard to training data. In OpenAI's case, the whistleblower probably got whacked for exposing it.

Can anyone make a compelling argument that any of these AI companies have the public's best interest in mind (alignment/superalignment)?

k__•46m ago
So, how should we as a society handle this?

Ensure the models are open source, so everyone can use them, as everyone's data is in there?

Close those companies and force them to delete the models, as they used copyright material?

carlosjobim•42m ago
If ingesting books into an AI makes Anthropic criminals, then Google et al are also criminals alike for making search indexes of the Internet. Anything published online is equally copyrighted.
kristofferR•36m ago
Yeah, we can all agree that ingesting books is fair use and transformative, but you gotta own what you ingest, you can't just pirate it.

I can read 100 books and write a book based on the inspiration I got from the 100 books without any issue. However, if I pirate the 100 books I've still committed copyright infringement despite my new book being fully legal/fair use.

carlosjobim•12m ago
I disagree that it has anything to do with copyright. It is at most theft. If I steal a bunch of books from the library, I haven't committed any breach of copyright.
1970-01-01•29m ago
The buried lede here is Anthropic will need to attempt to explain to a judge how it is impossible to de-train 7M books from their models.
dehrmann•21m ago
The important parts:

> Alsup ruled that Anthropic's use of copyrighted books to train its AI models was "exceedingly transformative" and qualified as fair use

> "All Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies"

It was always somewhat obvious that pirating a library would be copyright infringement. The interesting findings here are that scanning and digitizing a library for internal use is OK, and using it to train models is fair use.

jpalawaga•11m ago
I don't think that's new. Google set precedent for that more than a decade ago. You're allowed to transform a book to digital.