2. Is it possible to estimate how much copyrighted material has been used?
There's no easy answer there, hence New York Times v. OpenAI.
I think sticking a straw in Zlib or AA or LibGen or whatever it is, and drinking until it makes gurgling slurping noises as it hoovers up the dregs at the bottom of the barrel, is far, far removed from “fair use”.
For example, most popular textbooks have several pirate copies uploaded to the web. Some of them are even in plain sight and Googleable.
muzani•3mo ago
2. This is harder, as many of them don't disclose their training sets.