Copyright winter is coming (to Wikipedia?)

https://authorsalliance.substack.com/p/copyright-winter-is-coming-to-wikipedia
50•the-mitr•1h ago

Comments

area51org•55m ago
One fundamental difference: Wikipedia is not a for-profit corporation. OpenAI is. That probably matters.
throwaway-0001•44m ago
Nonprofit does not mean no salaries for executives; they still have highly inflated salaries.

Nonprofit just means there are no dividends to owners, but executives can very well get huge salaries. So nonprofit is actually a very bad name.

It should be called a non-dividend company.

cwillu•36m ago
It should be called exactly what it is called, because that is the correct term for benefits accrued to an owner.
charcircuit•34m ago
OpenAI is a nonprofit.
CGamesPlay•28m ago
Non-profit OpenAI ("OpenAI Foundation") holds a 26% interest in for-profit OpenAI ("OpenAI Group PBC").

https://www.cnbc.com/2025/10/28/open-ai-for-profit-microsoft...

nightshift1•25m ago
Not since Oct 28.
mmooss•31m ago
I've never heard that non-profits can violate intellectual property laws. Otherwise, that might give advantages to Sci-hub, shadow libraries, etc.
o11c•26m ago
Another fundamental difference: OpenAI explicitly markets their tool as a replacement for the copyrighted material it was trained on. This is most explicit for image generation, but applies to text as well.

As a reminder, the 4 factors of "fair use" in the United States:

1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;

2. the nature of the copyrighted work;

3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and

4. the effect of the use upon the potential market for or value of the copyrighted work.

chupchap•51m ago
From what I understood, the case against OpenAI wasn't about the summarisation. It was the fact that the AI was trained on copyrighted work. In case of Wikipedia, the assumption is that someone purchased the book, read it, and then summarised it.
throwaway-0001•46m ago
I think we have no evidence that someone bought the book and summarized it. And what if an AI bought the book and summarized it? Is it fine then?
jen729w•15m ago
Yes. Anthropic won that one.

https://authorsguild.org/advocacy/artificial-intelligence/wh...

colechristensen•44m ago
There are separate issues.

One is a large volume of pirated content used to train models.

Another is models reproducing copyrighted materials when given prompts.

In other words there's the input issue and the output issue and those two issues are separate.

cameldrv•25m ago
They’re sort of separate. In a sense you could say that the ChatGPT model is a lossily compressed version of its training corpus. We acknowledge that a jpeg of a copyrighted image is a violation. If the model can recite Harry Potter word for word, even imperfectly, this is evidence that the model itself is an encoding of the book (among other things).

You hear people saying that a trained model can’t be a violation because humans can recite poetry, etc, but a transformer model is not human, and very philosophically and economically importantly, human brains can’t be copied and scaled.

duskwuff•20m ago
> You hear people saying that a trained model can’t be a violation because humans can recite poetry, etc

Also worth noting that, if a person performs a copyrighted work from memory - like a poem, a play, or a piece of music - that's a copyright violation. "I didn't copy it, I memorized it" isn't the get-out-of-jail-free card some people think it is.

tavavex•17m ago
They're very separate in terms of what seems to have happened in this case. This lawsuit isn't about memory or LLMs being archival/compression software (imho, a very far reach) or anything like that. The plaintiffs took a bit of text that was generated by ChatGPT and accused OpenAI of violating their IP rights, using the output as proof. As far as I understand, the method by which ChatGPT arrived at the output, or how Game of Thrones is "stored" within it, is irrelevant. The authors allege that the output text itself is infringing regardless of circumstance and that OpenAI should therefore pay up. If it's eventually found that the short summary is indeed infringing on the copyright of the full work, there is absolutely nothing preventing the authors (or someone else who could later refer to this case) from suing someone else who wrote a similar summary, with or without the use of AI.
yorwba•15m ago
A jpeg of a copyrighted image can be copyright infringement, but isn't necessarily. A trained model can be copyright infringement, but isn't necessarily. A human reciting poetry can be copyright infringement, but isn't necessarily.

The means of reproduction are immaterial; what matters is whether a specific use is permitted or not. That a reproduction of a work is found to be infringing in one context doesn't mean it is always infringing in all contexts; conversely, that a reproduction is considered fair use doesn't mean all uses of that reproduction will be considered fair.

petermcneeley•51m ago
The implication here, of course, is that if we allow AI to be taken down by copyright, then it could also take down Wikipedia. I am not even sure this is close to being true, despite the article trying to suggest otherwise.

Perhaps a section on what the differences are might be helpful. For example, what role does style play in the summary? I don't think that the Wikipedia summary is in the style of George R. R. Martin.

tavavex•27m ago
I'm confused. There's an entire paragraph in the article where the author compares the two summaries and finds that they differ only in their structuring. I can't find any part of the article saying that the LLM summary was written "in the style of George R.R. Martin". As far as I understand, both summaries are conceptually very similar. That's the main problem. If the scope of substantial similarity to a novel is pushed down from hundreds of pages of writing to a summary that's a couple of paragraphs long, then all these summaries are in potential danger. To my knowledge, there are no criteria that let you find only LLM summaries infringing without leaving an opening for the lawyers to expand the reach to target all summaries of copyrighted content.
petermcneeley•15m ago
Even if true, Wikipedia would escape via fair use and AI would not. It is possible that the laws and judgements are inconsistent nonsense, but assuming they are not, the fact that Wikipedia has been around for decades suggests at least one key difference.
noduerme•29m ago
>> Every year, I ask students in my copyright class why the children’s versions of classic novels in Colting were found to be infringing but a Wikipedia summary of the plots of those same books probably wouldn’t be.

Not a lawyer, but the answer seems obvious: one is a commercial reproduction and the other is not. Seems like it would be a tougher question if the synopsis were in a set of Encyclopedia Britannica or something.

AI is clearly reproducing work for commercial purposes... ie reselling it in a new format. LLMs are compression engines. If I compress a movie into another format and sell DVDs of it, that's a pretty obvious violation of copyright law. If I publish every 24th frame of a movie in an illustrated book, that's a clear violation, even if I blur things or change the color scheme.

If I describe to someone, for free, what happened in a movie, I don't see how that's a violation. The premise here seems wrong.

Something else: Even a single condensation sold for profit only creates one new copyright itself. LLMs wash the material so that they can generate endless new copyrighted material that's derivative of the original. Doesn't that obliterate the idea of any copyright at all?

duskwuff•24m ago
Good guess, but no. The most salient difference in that case is that an abridged children's version of a novel acts as a direct market substitute for the original, whereas a plot summary does not. (A secondary reason is that an abridged edition is likely to represent a much larger portion of the original work than would appear in a summary.)

For further reading, see: https://en.wikipedia.org/wiki/Fair_use#U.S._fair_use_factors

cm2012•27m ago
This is my favorite article on HN since the one on solar panels in Africa. Love to see a subject matter expert making a case at the bleeding edge of their field.
wzdd•27m ago
Entertaining that the article about copyright-infringing similarity of AI-generated summaries is illustrated with a picture of an animated skeleton labelled "White Walker", which is neither what White Walkers are nor what they look like.
jrflowers•26m ago
I like that the author saw a cartoon of a skeleton looking at the back of a tablet and thought “this is good enough to describe as a white walker reading Wikipedia”
dev1ycan•20m ago
"AI" keeps destroying free sources of information.

First it was Library Genesis and z-lib, when Meta torrented 70TB of books and then pulled up the ladder. Recently it was Anna's Archive and how they (Google and others) are coming for it, plus weird behavior from some other torrent sites. Now Wikipedia is also being used as a tool to defend LLMs breaking any semblance of copyright "law" unpunished.

All these actions will have very bad repercussions once the bubble bursts; there will be a lot of explaining to do.

varenc•19m ago
The ruling never said summaries are infringing. It just said the authors’ claims about some AI outputs were "plausible" enough to get past a motion to dismiss, which is basically the lowest hurdle. The judge isn’t deciding what actually counts as infringement, just that the case can move forward. IMHO the article is reading more into the opinion than what the judge actually decided.
tavavex•14m ago
The author already fully addressed this in the article. They just think that even the fact that this was allowed to move forward is a worrying sign:

> Judge Stein’s order doesn’t resolve the authors’ claims, not by a long shot. And he was careful to point out that he was only considering the plausibility of the infringement allegation and not any potential fair use defenses. Nonetheless, I think this is a troubling decision that sets the bar on substantial similarity far too low.

bawolff•17m ago
Honestly, I always thought this was how it worked. A summary is by necessity a derivative of the thing being summarized, but it is also very, very clearly fair use. It's transformational, it's for an educational purpose, it contains only a tiny portion of the original work, and it does not compete with the original work. I can't imagine anything more fair use than that.

Personally, I'm not worried.

TheDong•12m ago
To me the key difference is that Wikipedia summaries are written by a human, and so creativity imbues them with new copyright.

OpenAI outputs are an algorithm compressing text.

A jpeg thumbnail of an image is smaller but copyright-wise identical.

An OpenAI summary is a mechanically generated smaller version, so new creative copyright does not have a chance to enter in.

areoform•9m ago
The push to expand repressive copyright laws because machines can learn from human produced text, code and art is going to bite artists in the long run.

Copyright is already extremely restrictive and (paired with commercial pressures) has resulted in a stilted creative commons.

Expanding copyright even more so that text / art that looks stylistically similar to another work is counted as infringing will, in the long run, give Disney's lawyers the power to punish folks for making content that even looks anything like Disney's many, many, many IP assets.

Even though Steamboat Willie has entered the public domain, Disney has been going after folks using the IP, https://mickeyblog.com/2025/07/17/disney-is-suing-a-hong-kon... / https://mickeyblog.com/2025/07/17/disney-is-suing-a-hong-kon...

The "infringement" in this case was a diamond encrusted Steamboat Willie style Mickey pendant.

Questionable taste aside, I think it's good for society if people are able to make diamond encrusted miniature sculptures of characters from a 1928 movie in 2025. But Disney clearly disagrees.

Disney (and other giant corps) will use every tool in their belt to go after anyone who comes close to their money makers. There has been a long history of tension between artists and media corps. But that's water under the bridge now. AI art is apparently so bad that artists are willing to hand them the keys to their castle.

CamperBob2•5m ago
It's a moot point, at least as far as AI is concerned, because nobody in China gives a mouse's behind about any of this.

Nor should they.
