Unsurprisingly this article confuses the issue somewhat by also talking about training models on content. I understand why that's in there - it's a hot topic, especially in the UK right now - but I don't think it's directly relevant to this complaint.
The note about robots.txt is interesting: "The BBC said in its letter that while it disallowed two of Perplexity's crawlers, the company 'is clearly not respecting robots.txt'."
Perplexity describe their user-agents here: https://docs.perplexity.ai/guides/bots
I had a look at https://www.bbc.com/robots.txt and it does indeed block both PerplexityBot ("designed to surface and link websites in search results on Perplexity" - I think that's their search index crawler) and Perplexity-User ("When users ask Perplexity a question, it might visit a web page to help provide an accurate answer and include a link to the page in its response").
But... I checked the Internet Archive for a random earlier date - Feb 2025 - https://web.archive.org/web/20250208052005/https://www.bbc.c... - and back then the BBC were blocking PerplexityBot but not Perplexity-User.
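You can reproduce this kind of check yourself with Python's standard-library robots.txt parser. This is a minimal sketch using a hypothetical robots.txt body resembling the rules described above; the real file at https://www.bbc.com/robots.txt may differ and should be fetched directly if you want current results.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt resembling the rules discussed above
# (the live file at https://www.bbc.com/robots.txt may differ).
robots_txt = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

for agent in ("PerplexityBot", "Perplexity-User", "SomeOtherBot"):
    allowed = parser.can_fetch(agent, "https://www.bbc.com/news")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With the rules above, both Perplexity agents come back blocked while an unlisted agent is allowed, which is the distinction the February 2025 snapshot turned on.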
> Since a user requested the fetch, this fetcher generally ignores robots.txt rules.
[1]https://www.tomshardware.com/tech-industry/artificial-intell...
Normally the expectation is that the user-agent faithfully presents the content it fetched.
If I made a browser that fetched bbc.com, stripped away the ads, and presented the result to users, I would expect the BBC to not like it and to block that user-agent from accessing the site. It isn't a robots.txt thing. It is a user-agent thing.
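Server-side user-agent blocking of this kind is usually done in web-server or CDN config, but the logic is simple enough to sketch. This is a hypothetical illustration; the agent names come from the docs linked above, and the substring matching mirrors how robots.txt-style agent matching typically works:

```python
# Minimal sketch of server-side blocking by User-Agent header.
# Agent names taken from Perplexity's bot docs; real deployments
# would normally do this in nginx/CDN rules, not application code.
BLOCKED_AGENTS = ("PerplexityBot", "Perplexity-User")

def should_block(user_agent_header: str) -> bool:
    # Case-sensitive substring match against the full header value.
    return any(agent in user_agent_header for agent in BLOCKED_AGENTS)

print(should_block("Mozilla/5.0 (compatible; PerplexityBot/1.0)"))  # True
print(should_block("Mozilla/5.0 (X11; Linux x86_64)"))              # False
```

The key difference from robots.txt is enforcement: robots.txt is a request that a polite crawler honors voluntarily, while a User-Agent block is applied by the server and cannot be "ignored", only evaded by spoofing the header.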
> Since a user requested the fetch, this fetcher generally ignores robots.txt rules.
...was added sometime between 30.01.2025 [0] and 07.02.2025 [1], and makes it sound like robots.txt was not respected by that bot anyway.
[0]: https://web.archive.org/web/20250130164401/https://docs.perp...
[1]: https://web.archive.org/web/20250207113929/https://docs.perp...
Unless Perplexity has a way to indirectly pay writers the way Google does, this is very rich.
> four popular AI chatbots - including Perplexity AI - were inaccurately summarising news stories, including some BBC content.
One of the interesting consequences of LLM failures is that original news sources now look more concise and more authoritative by comparison. Even Google gets facts wrong in its AI summaries, so one is compelled all the more to go read the website instead. And I'm not sure LLMs will ever be able to tell truth from lies.
For example I like to watch F1 and I like to know the times for all sessions in my timezone during the weekend.
It's surprisingly hard to find this information, because Google search is SEOed to hell and back by sites that hide it behind endless articles full of irrelevant AI slop and two million intrusive ads, and that's assuming they have it at all, or have it right.
Perplexity wades through all that shit, gives me a neatly formatted table and has never been wrong so far.
So I can see where the BBC is coming from but I also don't really want them to win.
I use it the same way as well, but every time I use it, I feel icky. A sense of impending doom.
Imagine a book-summaries service that helped users never buy a book again. What is the incentive for a writer to write one, knowing that within minutes a summary of the work will be available on a different site?
News sites are unique in that the value they provide, for the most part, is its real-time nature. BBC reporting on the latest in London is the work of so many journalists, and if Perplexity sidesteps that, the BBC has no incentive (and, in the future, no money) to do that work. It kills the BBC, and ultimately it kills Perplexity too.
So yes, Perplexity is playing a very dangerous short term game, and BBC is right in suing them.
> BBC is coming from but I also don't really want them to win.
If the BBC doesn't win, the BBC (and other sites that "produce" information) dies, which in turn kills Perplexity.
A very old argument: If you don't want people scraping or downloading your content don't put it on the (public) Internet!
Imagine we had LLM-like functionality in the 1980s: Sony announces a new VCR that can watch a recorded news show and print out a summary on a connected ImageWriter II. People start using it to summarize the publicly broadcast BBC news programs.
Today's scenario would be like the BBC sues Sony for providing that functionality.
1000000x'ing fair use... might no longer be fair use.
The balances between society and copyright need to change when scale changes drastically.
To address the elephant in the room: what happens when there are only leechers and no sources, because we've let them hijack first-party news revenue without creating a replacement?
esskay•4h ago
That's got to be the most delusional response they could've given. It's not the BBC's job, or any other news publisher's, to preserve Google's monopoly. The comparison would only work if Google were replacing a link to a BBC article in the search results with a direct copy of that article on the results page.