
Anthropic destroyed millions of print books to build its AI models

https://arstechnica.com/ai/2025/06/anthropic-destroyed-millions-of-print-books-to-build-its-ai-models/
15•bayindirh•5h ago

Comments

JohnFen•4h ago
> In the process, the company cut millions of print books from their bindings, scanned them into digital files, and threw away the originals solely for the purpose of training AI

Oh boy. The more I learn about how genAI companies work, the more detestable they appear to be.

ThrowawayR2•4h ago
You got suckered by the clickbait. Destructive scanning (https://en.wikipedia.org/wiki/Book_scanning#Destructive_scan...) isn't unusual for books that are common enough that an individual volume is of no particular value.
bayindirh•4h ago
I mean, they could have gotten e-book versions of the books, or even preprint PDFs.

In an era where people are starting to calculate the environmental impact of the jobs they run on the cloud and to optimize it, adding that much load to the recycling system is not a wise choice, only a selfish one.

ThrowawayR2•3h ago
I'm sure they would have loved to save the hassle and expense of disassembling physical books. Presumably something legal related or cost related prevented them from going that route.
JohnFen•3h ago
Yes, they did it as a workaround for copyright. TFA explains that aspect.
rpdillon•13m ago
It's not a workaround for copyright. It's to obey copyright. As in: copyright law is the reason they destroyed the books.

Meta didn't have to do any of this. They just used The Pile.

AlotOfReading•3h ago
I strongly suspect that dealing with ebooks on this scale might actually be even more onerous than the physical volumes.

The physical stuff is straightforward. Buy books from bulk sellers, rip off everything and put them into off-the-shelf rigs for digitization. It's straightforward, directly scalable, can use any book, and your main issue is format shifting, which Anthropic successfully argued here. No DRM, you buy exactly the books you need, and every book is processed exactly the same way.

If you try to buy ebooks, you get wrapped up in onerous licensing terms about copying, and how you're able to use them, how long you're able to access them, and so on. Many books won't even be available (or can only be licensed alongside a bunch of others) and you have to deal with DRM you can't strip without creating additional copyright issues.

We've somehow created a world where physical objects are more free than bits.

rpdillon•16m ago
No, they probably couldn't have. eBooks are notoriously DRMed and the DMCA makes it illegal to circumvent an effective copy protection mechanism even if you otherwise have legal access to the work. Furthermore, first sale doctrine doesn't apply to any digital files and they can't be obtained legally in bulk.
JohnFen•3h ago
I didn't get suckered by anything. I'm aware of the practice. I find it objectionable. That they did this is just another thing on the growing list of objectionable things that genAI companies seem to enjoy doing.

To be honest, I probably wouldn't have even commented on it if it were the only bad thing these companies do.

rpdillon•18m ago
It was only legal because they did it this way.

> Ultimately, Judge William Alsup ruled that this destructive scanning operation qualified as fair use—but only because Anthropic had legally purchased the books first, destroyed each print copy after scanning, and kept the digital files internally rather than distributing them. The judge compared the process to "conserv[ing] space" through format conversion and found it transformative.

The very laws that the publishing industry has lobbied so heavily to make so strict are the reason for this behavior.

EA-3167•4h ago
I don't like Anthropic, I find their "marketing through fear" approach shitty, and frankly I'm over the AI "boom" anyway.

BUT... here's the only line in that whole article that really matters, because this is a headline meant to create an impression that isn't corrected for quite a while.

> The court documents don't indicate that any rare books were destroyed in this process—Anthropic purchased its books in bulk from major retailers

Books are routinely pulped and recycled, they aren't holy, and if they aren't rare then frankly who cares what techniques they use to scan them? The issue is whether or not "AI" learning represents fair use, which the courts so far have ruled that it does.

bayindirh•4h ago
> any rare books were destroyed in this process

Does it matter? It's waste at the end of the day. Instead they could have bought e-books. Just because we can recycle paper, it doesn't mean we have the luxury to create waste as we see fit, esp. when climate change has become this severe.

> which the courts so far have ruled that it does.

Any concrete cases you can cite?

From [0], for example, while the court said that the authors failed to argue their case, the second observation is the complete opposite of what you said. Citing the article directly:

    Opinion suggests AI models do generally violate law.

In the same spirit, I think I can safely assume that they violated copyright law, since they earn money by circumventing it, and fair use doesn't like for-profit copying.

[0]: https://news.bloomberglaw.com/litigation/meta-beats-copyrigh...

kirrent•3h ago
TFA is based on the ruling which found that Anthropic training on these books was fair use.
robocat•3h ago
> It's waste at the end of the day

Rubbish.

More likely they are taking a waste stream of books and reusing and possibly even recycling.

Few people want old books, and many people that have books are throwing them out or donating them. I don't think I know anybody under 30 with a bookshelf of books they obviously intend to keep for life. Bookshelves used to be an elite status symbol, now I often see them as image rather than reference (e.g. part of the backdrop behind an influencer vid).

It is likely they didn't destroy much of value, since they will have minimized their purchasing costs. Modern DRM is not helping.

cma•3h ago
They'd have to agree to special terms that go beyond the normal first sale doctrine. If those terms don't hold up, their own terms against training on their model data for foundation models might not hold up either, so you can see their perverse incentive to burn books.
JohnFen•3h ago
> Does it matter?

As someone who finds the act objectionable, I actually do think this is an important point. Destroying commodity books in this way is objectionable. Destroying precious books in this way would be abominable.

miohtama•3h ago
Reuters news on the lawsuit

https://news.ycombinator.com/item?id=44375269

shawn_w•3h ago
Getting flashbacks to Vernor Vinge's book Rainbows End, where there's a project to rapidly digitize the collection of UCSD's Geisel Library by shredding all the books, photographing the fragments of pages, and reassembling them via computer programs.

It's set in 2025.

igor47•3h ago
Lol jinx!
igor47•3h ago
Reminds me of "Rainbows End" by Vinge. There's a machine that's like a giant worm, which slithers down the stacks in a library, vacuuming up all the books. They pass through shredders, and then the shredded remains fly down the body of the worm, which is studded with cameras. The cameras photograph the pieces and then software reconstructs the content of the books based on unique shapes of the shreds, like solving a million simultaneous jigsaw puzzles. The paper is excreted and recycled or burned.
mensetmanusman•1h ago
This reminds me of scrolls in Diablo. Soon real books will all disappear to dust as AI stats are improved.