Edit: It seems like they are. Stealing from tens of thousands of artists, big and small, and calling it "preservation" or "archiving" is scummy.
Most people do not because they find it less convenient than paying 20 bucks a month, or whatever the current price is in 2025, but that doesn't change the reality.
For most people the appeal of Spotify is not the music itself but the playlists that are shared thanks to its ubiquity. This is the reason other services struggle to make a dent even if they have better quality, UI and algos.
Spotify started by disrupting the market using pirated music, by the way, so you are pretty much endorsing and encouraging piracy when "paying" your favorite artists through Spotify.
What’s actually scummy is Spotify paying artists $1 per 1000 streams.
Buy CDs. Use Bandcamp.
So let the rights holders make the decision? They would never. Music rights exist for them to extract profit above all else. They don't care about preserving culture or legacy. Which is why it's important that somebody does.
My Spotify Wrapped says I listened for 50,000 minutes this year. Assuming 2 minutes per song, that's 25,000 streams. I paid them $110, or about $0.004/stream. Assuming I'm a typical user, they obviously could not afford to pay any more than that per stream.
I googled "spotify pay per listen" and the first result is a Reddit comment saying "The average payout on Spotify is only $0.004 per stream." The Google AI Overview says "Spotify [..] pays artists a fraction of a cent, typically $0.003 to $0.005 per stream". So I'll assume it's something in that ballpark.
So it seems like Spotify's payouts are completely reasonable, given their pricing. Is my logic wrong somewhere?
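For anyone who wants to check the arithmetic, here is a minimal sketch that uses only the figures quoted above (50,000 minutes, $110 paid, and the $0.003 to $0.005 range). The 2-minute song length and the "typical user" premise are the commenter's assumptions, and the variable names are mine.

```python
# Figures quoted in the comment; the 2-minute song length is the commenter's assumption.
minutes_listened = 50_000
minutes_per_song = 2
annual_fee_usd = 110

# Number of streams implied by the listening time.
streams = minutes_listened / minutes_per_song

# Subscription revenue this one listener contributes per stream.
revenue_per_stream = annual_fee_usd / streams

# Payout range quoted from the search results in the comment.
quoted_low, quoted_high = 0.003, 0.005
within_quoted_range = quoted_low <= revenue_per_stream <= quoted_high

print(f"streams: {streams:.0f}")
print(f"revenue per stream: ${revenue_per_stream:.4f}")
print(f"within quoted payout range: {within_quoted_range}")
```

Changing `minutes_per_song` to a longer average track length lowers the stream count and raises the per-stream revenue, so the result is fairly sensitive to that one assumption.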
The value of Spotify is the convenience, and this collection does not change that in any way. Your argument would apply if someone were to make a Spotify clone with the same UX using this data.
This doesn't apply to dematerialized content: the original copy still exists. The only negative impact occurs if someone decides to actually use the pirated copy in place of buying a licensed one.
The mere existence of this new pirate copy being around doesn't automatically imply that, especially if other, more convenient sources are available.
Full disclosure, I am a career musician AND have been known to pirate material. That said, I think this is a valuable archive to build. There are a lot of recordings that will not endure without some kind of archiving. So while it's not a perfect solution, I do think it has an important role to play in preservation for future generations.
Perhaps it's best to have a light barrier to entry. Something like "Yes, you can listen to these records, but it should be in the spirit of requesting the material for review, and not just as a no-pay alternative to listening on Spotify." Give it just enough friction where people would rather pay the $12/month to use a streaming service.
Also, it's not like streaming services are a lucrative source of income for most artists. I expect the small amount of revenue lost to listeners of Anna's Archive is just (fractions of) a penny in the bucket of any income that a serious artist would stand to make.
It is technically not. Stealing means you have a thing, I steal it, now I have the thing and you do not. You can’t steal a copyright (aside from something like breaking into your stuff and stealing the proof that you hold the copyright), and when a song is downloaded, the original copyright holder still has their copy.
Calling piracy theft was MPAA/RIAA propaganda. Now people say that piracy is theft without ever even questioning it, so it was quite successful.
It used to be more mixed, but today, piracy is often the only option to "own" any media at all.
> A while ago, we discovered a way to scrape Spotify at scale.
They won’t and shouldn’t divulge the details, but I imagine that would be a fun read!
"Their buisness model is based on copyright infringement"
Well, where to complain that Anna's Archive ain't a business?
I definitely was not aware Spotify DRM had been cracked to enable downloading at scale like this.
The thing is, this doesn't even seem particularly useful for average consumers/listeners, since Spotify itself is so convenient, and trying to locate individual tracks in massive torrent files of presumably tens of thousands of tracks each sounds horrible.
But this does seem like it will be a godsend for researchers working on things like music classification and generation. The only thing is, you can't really publicly admit exactly what dataset you trained/tested on...?
Definitely wondering if this was in response to desire from AI researchers/companies who wanted this stuff. Or if the major record labels already license their entire catalogs for training purposes cheaply enough, so this really is just solely intended as a preservation effort?
I wouldn’t be so sure. There are already tools to locate and stream pirated TV and movie content automatically and on demand. They’re so common that I had non-technical family members bragging at Thanksgiving about how they bought a box at their local Best Buy that has an app which plays any movie or TV show they want on demand without paying anything. They didn’t understand what was happening, but they said it worked great.
> Definitely wondering if this was in response to desire from AI researchers/companies who wanted this stuff.
The Anna’s Archive group is ideologically motivated. They’re definitely not doing this for AI companies.
Very interesting, thank you. So using this for AI will just be a side effect.
And good point -- yup, can now definitely imagine apps building an interface to search and download. I guess I just wonder how seeding and bandwidth would work for the long tail of tracks rarely accessed, if people are only ever downloading tiny chunks.
Anyone who wants to listen to unlimited free music from a vast catalog with a nice interface can use YouTube/Google Music. If they don't like the ads they can get an ad blocker. Downloading to your own machine works well too.
They have a page directly addressed to AI companies, offering them "enterprise-level" access to their complete archives in exchange for tens of thousands of dollars. AI may not be their original/primary motivation but they are clearly on board with it.
Didn't Meta already publicly admit they trained their current models on pirated content? They're too big to fail. I look forward to my music Slop.
It's probably going to make the AI music generation problem worse anyway...
Do they have DRM at all? YouTube and Pandora don't.
Additionally there was a lot of discourse about music and a lot of curated discovery mechanisms I sorely miss to this day. An algorithm is no replacement for the amount of time and care people put into the web of similar artists, playlists of recommendations and reviews. Despite it being piracy, music consumption through it felt more purposeful. It's introduced me to some of my all time favourite artists, which I've seen live and own records and merchandise of.
One interesting way of discovering artists is finding an artist that I already like on a compilation CD, and then seeing what else is on the CD.
Anecdotally, I know a few vocalists that sound great in these keys and use them as a starting point.
For the major scale, there are 7 notes in the scale and only 5 black keys; you also need to skip ti, the 7th note.
For the minor scale ("C#m"), it's worse; only four of the five black keys are part of that scale.
And I would have thought that something intended to be played only on the black keys would be described as using a pentatonic scale anyway?
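The black-key counts above are easy to check by hand, but here is a rough sketch that does it with pitch classes (0 = C up to 11 = B). It assumes the key under discussion is C#, since "C#m" is the only key named, and it assumes the natural minor scale. The helper names are made up for illustration.

```python
# Pitch classes 0-11 starting at C; the black keys are C#, D#, F#, G#, A#.
BLACK_KEYS = {1, 3, 6, 8, 10}

# Semitone offsets from the root for each scale type.
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]
MAJOR_PENTATONIC_STEPS = [0, 2, 4, 7, 9]


def scale_pitch_classes(root, steps):
    """Return the pitch classes of a scale built on `root`."""
    return [(root + step) % 12 for step in steps]


def black_key_degrees(root, steps):
    """Return the 1-based scale degrees that land on black keys."""
    return [
        degree
        for degree, pitch in enumerate(scale_pitch_classes(root, steps), start=1)
        if pitch in BLACK_KEYS
    ]


C_SHARP = 1  # assumed root, taken from the "C#m" mention
F_SHARP = 6  # root whose major pentatonic is exactly the black keys

major_black = black_key_degrees(C_SHARP, MAJOR_STEPS)
minor_black = black_key_degrees(C_SHARP, NATURAL_MINOR_STEPS)
pentatonic = set(scale_pitch_classes(F_SHARP, MAJOR_PENTATONIC_STEPS))

print("C# major degrees on black keys:", major_black)
print("C# minor degrees on black keys:", minor_black)
print("F# major pentatonic is all black keys:", pentatonic == BLACK_KEYS)
```

In C# major the 3rd and 7th degrees (mi and ti) fall on white keys, and in C# natural minor only four of the seven notes are black keys, which matches the counts in the comment.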
https://www.scribd.com/document/56651812/kreitz-spotify-kth1...
A distributed ripping project to do that would be a fine thing.
The data will be released in different stages on our Torrents page:
[X] Metadata (Dec 2025)
[ ] Music files (releasing in order of popularity)
[ ] Additional file metadata (torrent paths and checksums)
[ ] Album art
[ ] .zstdpatch files (to reconstruct original files before we added embedded metadata)
Another extremely annoying effect is, being 40+, they only suggest music for my age. In “New” and “Trending”, I see Muse and Coldplay! I should make myself a fake ID just to discover new music, but that gets creepy very fast.
There is contemporary lost media being created every day because of how we distribute things now. I think in some cases, the intent of the publisher was to literally destroy every copy of the information. I understand the legal arguments for this, but from a spiritual perspective, this is one of the most offensive things I can imagine. Intentionally destroying all copies of a creative work is simply evil. I don't care how you frame it.
Making media effectively lost is not much different in my mind. Is it available if it's sitting on a tape in an iron mountain bunker that no one will ever look at again?
But, more importantly, I cannot even say "good for you", because I don't actually think it is good for Anna's Archive. I wouldn't touch that thing, if I was them. Do we even have any solid alternatives for books, if Anna's Archive gets shot down, by the way? Don't recommend Amazon, please.
Now imagine a dedicated music client that will download and stream (and share, because we are polite) only the needed files :)
The people who give money to artists are the ones going to concerts and buying music directly from artists. Spotify gives artists cents, incentivizing awful behaviour (AI music, aggressive marketing, low-effort art...).