We can dream. ISP says NO!
This is why most NOCs enforce an asymmetric firewall bandwidth limit for download focused customers, and install colo Google/CDN rack appliances.
Most ISPs are just the modern retooled cable company business. Try to stream video off your home platform on a normal service port, and you will hit the caps pretty quickly. =3
I think of these things as if they only exist in the past, but I'm sure these places are still around; I just don't go to them as much now. Probably because all the useful stuff these days exists mainly on reddit and the stackexchanges.
If no, you can still do that today.
If yes, then imagine the other 1,000,000 people who think likewise. That was how we got where we are today.
(https://hn.algolia.com/?dateRange=pastWeek&page=0&prefix=tru...)
- "Foundation", Isaac Asimov
Nobody at the head of these large tech firms has a clue about where this technology is heading.
Edit: Re-read it and just noticed this:
> assuming the AI companies are willing to step up, support the ecosystem, and pay for the content that is the most valuable to them
Lolllllll. Y'all are gonna love my pitch for an airline startup. It begins, "assuming we can turn off gravity..."
Paying a security guard isn't considered a protection racket, while paying a member of the mob so that nothing happens to my store is considered a protection racket.
The problem is the arguments they make for why this should happen are quite compelling, especially to those running sites (you'll see plenty of complaints on this forum about it), but there's also a large group of people who think information / code / data should be "free" (see open source code/maps/anything you can think of). So really it's just a moral debate that will be lost in the interest of profit (which is ya know good n bad, if AI companies did more caching we probably wouldn't need this, but here we are).
[0] https://blog.cloudflare.com/introducing-ai-crawl-control/
Cloudflare blocks AI scrapers unless they pay the toll.
> What's most interesting is what content companies are getting the best deals. It's not the ragebait headline writers. It's not the news organizations writing yet another take on what's going on in politics. It's not the spammy content farms full of drivel. Instead, it's Reddit and other quirky corners that best remind us of the Internet of old. For those of you old enough, think back to the Internet not of the last 15 years but of the last 35. We’ve lost some of what made that early Internet great, but there are indications that we might finally have the incentives to bring more of it back.
> It seems increasingly likely that in our future, AI-driven Internet — assuming the AI companies are willing to step up, support the ecosystem, and pay for the content that is the most valuable to them — it’s the creative, local, unique, original content that’ll be worth the most. And, if you’re like us, the thing you as an Internet consumer are craving more of is creative, local, unique, original content. And, it turns out, having talked with many of them, that’s the content that content creators are most excited to create.
What is most likely to happen is the so called Answer Engines will embed advertising into their results -- except in a more insidious, subtle, hard to detect and filter out manner.
The Open Letter reads naive at best, asking us to imagine an Internet where creators are rewarded for "filling the holes in human knowledge". We all know that is not what sells, and that the opposite of this will continue to inundate the infosphere.
This is the part of the new AI paradigm that concerns me the most, and I think you are correct.
If "pay for placement" (or worse "legislate for placement") in LLM training becomes a thing, then we lose all transparency as these biases get baked into the knowledge set, and users have no way of knowing where and when they get applied.
This discussion is going to be rife with the pot calling the kettle black as people who have blocked every ad for the last 15 years call out AI for not compensating creators...
Fantastic! I wish I could undermine their clickbait business model even more..
> But there’s reason for optimism
You mean Cloudflare being investigated for antitrust?
This is also being attempted by RSL with their “Crawler Authentication Protocol” (https://rslstandard.org/guide/web-crawlers) for demanding proof of licensing from scrapers and RSL Collective (https://rslcollective.org) for providing the licensing itself. The missing piece there is the ability to detect scrapers with high accuracy without punishing regular browsing humans.
So you use a public resource and presumably like the upside that comes with sharing on it, but you want to limit uses now that someone has found one that you don't like, so you're fine with degrading that public resource?
You could always share things privately or behind a paywall if you don't want them available publicly. But people seem to want to have their cake and eat it too.
I get why a hosting provider would want to limit crawlers to save bandwidth. The "creator" angle is just greed.
Can you say more? what is the "public resource" I'm using as a content publisher? In that role I see myself more as the provider of a public resource (my content), not a user.
It should be front and center, communicated clearly, and easy to understand. If I was an investor, the lack of clarity after reading a few paragraphs would concern me.
As a content consumer, I'm also hoping to be part of the ecosystem. I already use Patreon a lot as "AdBlock absolution", but it doesn't fix the market dynamics. Major content platforms tend to stagnate or worsen over time, because they prefer to sell impressions to advertisers than a good product to consumers.
Imagine trying to create and distribute a movie without the backing of your local distributor/broadcasting cartel. Only instead of gating the movie theatre, Cloudflare is the single access to all things web. Also, based in the US, so good luck producing content their government doesn’t like.
> You could imagine an AI company suggesting back to creators that they need more created about topics they may not have enough content about. Say, for example, the carrying capacity of unladened swallows because they know their subscribers of a certain age and proclivity are always looking for answers about that topic. The very pruning algorithms the AI companies use today form a roadmap for what content is worth enough to not be pruned but paid for.
They're not just making it up..
Would AI companies be willing to match that (even 50% match)? If not, we might just end up with low paid copywriters / ghost writers churning out large amounts of content for LLMs in subject areas where they don't necessarily have expertise.
Not mentioned: there would be a single gatekeeper for the internet, Cloudflare.
The "level playing field" rhetoric reminds me so much of Apple talking about the App Store. This new internet business model is just the App Store, substituting websites for apps and Cloudflare for Apple. The system only works with some middleman between the AI companies and the content creators.
In which case for the self-host people we can just pick a decent CDN?
For mostly-static or static-site content it's pretty nice all around. I've not gotten into the SQLite service so much though, which I might and seems interesting.. there's also Turso to consider as an alternative option... Not to mention Deno, Fastly and other similar options.
Kind of hoping to see something similar start to gain traction to support wasm backend systems similarly.
This is beyond their DDoS/Proxy protection, but worth considering as part of what they offer. There's a lot there to like.
This feels like a lose-lose situation.
Only if you somehow believe you have a choice, or even a say.
https://news.ycombinator.com/item?id=45332860
That's not nothing.
Nothing in their idea challenges the underlying tech behind the internet. Anyone is free to compete in constructing a reverse proxy service with LLM-centric content controls similar to Cloudflare, whether that’s AWS WAF or Akamai or some new startup.
From the stats I've seen, Cloudflare has an 80% market share for reverse proxy services. 20% of all websites use Cloudflare, 50% of the most popular websites globally. That's a dangerous amount of concentration, and it's the only reason Cloudflare can propose this new business model for the internet and be taken seriously.
Bing/msn.com failed to displace Google because Google was simply better, not because Google played dirty.
Where did I mention hate? I don't care what emotions you feel or don't feel. The problem is the concentration of power in one company. That has nothing to do with emotion.
> Bing/msn.com failed to displace Google because Google was simply better, not because Google played dirty.
I don't think the courts agree with you about Google playing dirty. In any case, monopolies are inherently dangerous.
But it's true? It's still true today. The only worrying part of the story is that Google also makes a browser and an OS, which doesn't apply to Cloudflare.
The above comparison to App Store is even weirder / more ridiculous. App devs publish on App Store because App Store is pre-installed on every iPhone already, so it maximizes the number of users they can reach. Websites use Cloudflare to protect themselves, at the cost of reducing the number of users they can reach. The two situations are so different that "false equivalence" is an understatement.
Well actually: https://blog.cloudflare.com/supporting-the-future-of-the-ope...
> App devs publish on App Store because App Store is pre-installed on every iPhone already, so it maximizes the number of users they can reach.
This seems like a weird statement, because App Store is the only way of publishing apps on iPhone. The statement might make sense if you were talking about the Mac, on which App Store is pre-installed, but developers can still publish outside the App Store.
> Websites use Cloudflare to protect themselves, at the cost of reducing the number of users they can reach.
How does Cloudflare reduce the number of users they can reach?
I wouldn’t recommend trusting any large company but so far Cloudflare doesn’t appear to be pulling a Google because they sell directly rather than to third parties. Google never charged for search so they ended up doing a reverse acquisition into DoubleClick to get advertisers to pay for the searches we do. Cloudflare does have a free tier but their paid services are decidedly not free and since they have serious competition in the CDN business, zero-trust, etc. they have the direct incentive not to screw their customers which Google lacked. I’d get worried if that ever changes.
I envision a UI that displays the message and, in a sidebar, lists what aspects of the message classify it as hate speech. Then, like a spam filter, you could decide to block the message.
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1084806... https://arxiv.org/abs/2405.01577 https://arxiv.org/abs/2401.03346
I don't get it. Can you point out where Cloudflare gains the power to prevent Google or any other company from doing the same?
It's not impossible. But it's hard.
Cloudflare is already a monopoly though. From what I can tell, what they are saying here, besides proclaiming their continued existence, is that they and AI companies and content creators need the internet to exist.
And what they are building is on top of the 402 response, which anyone can use? So you could use that without using any CDN at all, without too much development cost?
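As a rough illustration of the point about 402, here is a minimal sketch of the decision a plain origin server could make without any CDN in front of it. The header names `X-Crawl-Price` and `X-Crawl-Payment-Token`, the token set, and the price are all hypothetical and not taken from Cloudflare's scheme; only the 402 Payment Required status code itself is standard HTTP. The user-agent substrings are just examples of AI crawler names.

```python
from http import HTTPStatus

# Hypothetical header names, invented for this sketch.
PRICE_HEADER = "X-Crawl-Price"
PAYMENT_HEADER = "X-Crawl-Payment-Token"

# Example user-agent substrings treated as AI crawlers (illustrative list).
AI_CRAWLER_MARKERS = ("GPTBot", "ClaudeBot", "CCBot")


def decide(user_agent, headers, valid_tokens, price_usd="0.01"):
    """Return (status_code, response_headers) for an incoming request.

    Regular visitors get 200. Requests that look like AI crawlers get
    402 Payment Required with a quoted price, unless they present a
    token from valid_tokens (how tokens are issued is out of scope).
    Header lookup is case-sensitive here to keep the sketch short.
    """
    ua = user_agent.lower()
    is_crawler = any(marker.lower() in ua for marker in AI_CRAWLER_MARKERS)
    if not is_crawler:
        return int(HTTPStatus.OK), {}
    if headers.get(PAYMENT_HEADER) in valid_tokens:
        return int(HTTPStatus.OK), {}
    return int(HTTPStatus.PAYMENT_REQUIRED), {PRICE_HEADER: price_usd}
```

The hard parts are everything this leaves out: crawlers that lie about their user agent, and actually moving money. That is presumably where a middleman earns its cut.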
I don't fundamentally think that Cloudflare's view here is _wrong_ - by and large, when I google what internal temperature to cook pork to so I don't die, I'm not looking for an opinion or someone's life story - but I think I'm much more interested in how we create the weird niches that create the "knowledge" for the blender. Cloudflare alludes to it with the talk about Reddit and whatnot, but I'd love to know what the plan is to actually create and nurture those communities where people can really get weird into whatever topic they're interested in. Right now we're all sort of just ghettoized into various Facebook communities or whatever, but recreating an actual vibrant communal internet where people can find their weirdo subcultures and actually negotiate on some kind of reasonable footing with the LLM scrapers (who were the social network people, who were the search engine people, who have always been the ad people) would be a genuine improvement to the internet.
* read "hellish swamp diving affair"
The landlord of the marketplace should probably not dabble in the appraisal of products, whether for factuality or value.
This is an attempt to rewrite history. Back in the day, we stood up servers at our own expense and filled them with content for free, for nothing other than the fun of it. In fact, the "vibrancy" of the internet appears to be inversely correlated to the number of people using it to generate a profit.
That's the "never been free" in that statement.
I don't think they're saying what you think they're saying.
So, same as it is today.
I don’t know if I see a company extracting rent from other people as a win…
What could possibly go wrong?
Because the providers will act as a single tunnel that all content passes through before reaching the end user, the tolls they collect will be large. So, I don't doubt that there will still be opportunities for content creators to earn money as answer engines siphon off more and more of the web's traffic, but expect those opportunities to be broadly low-paying, falling decidedly in the "side hustle" category.
AI providers will want to incentivize content creation. There will still be a glut of ready providers, and little reason for providers to make anything but small, nominal payments.
If the open internet is already dead (and it is already dead), it’s better to accept that reality and silo off the good parts behind paywalls so that people can get paid, rather than to let bad people benefit massively from it while they build their walled gardens. This has been a long time coming.
"Content creators" are part of the problem. Generating endless "content", which isn't very useful or valuable, and creates too much "content supply", which devalues it. This then creates a giant "soup" of content that viewers drown in, trying to find some content that isn't as identically useless as all the rest. But the big incumbents love it, because they use this content soup to collect money from - you guessed it - ad companies. (Those same companies they don't want to make ad-blockers against...)
So now that search is dead, they need to find a new way to drink from the ad-dollar faucet. At first it'll be "pay content creators from AI subscription money", but then AI will be offered "free with ads", and then later "paid with ads". And nobody's going to stop it, because everyone is "happy": the viewer gets their free crap (with ads), the content creators get a few pennies, the big companies rake in billions, and ad companies continue their "industrial welfare" by pouring their excess profits into this whole system as ad-dollars. The great commoditization of eyeballs continues.
I'm not sure why there would be an expectation from most users that AI should compensate creators.
Their "map of human knowledge" Swiss cheese model assumes all "content" is information. There are many other types of "content", which don't fit in this "Answer Engine" peg. Maybe it's the only content LLMs care about, but then this isn't a "new business model for the web", it's a "new business model for AI". They say the AI will tell content creators what gaps there are in "human knowledge" (?), and pay them to fill those gaps. But that's not how content works. (1) Most "content" is derivative, not new, yet can still be tremendously valuable. They're not addressing most monetization issues linked with AI. (2) You often don't know a "gap" exists until it's filled (people doubt you at the start, then once you succeed they copy you). (3) That is not how "creative, local, unique, original" content is created. It's created bottom-up, spontaneously, often with zero monetization. They cite Reddit; the reason we all started adding "site:reddit.com" is because of the non-monetized spontaneous comments. You can't replace that with a top-down monetized model that tells you what to create. At that point, why not have the AI create/generate it by itself, right? You can't cram "creative, local, unique, original" into an "Answer Engine" peg without losing everything that made it valuable, because Answer Engines will be met with Answer Engine Optimization.
Creators should be able to trace and quantify exactly what data of theirs was fed into the grift machine and be reimbursed directly by the grift machine custodian each time their data is used to generate new output.
A middleman who collects a giant chunk of creator royalties, for data that will be used perpetually by the grift machine... that sounds like a bad deal.
ads have been the only micropayment system that has worked
Why are micropayments so hard? I wonder what the web would look like if one could click "pay 1 cent to continue".
Maybe content would become better? Maybe it would make one think "Hmm... one moment, is this something I want to read or am I just doomscrolling?".
Regulation.
Realistically, what was the benefit of the ad-driven model? From my point of view, most of the highly-valuable information came from the various forums, wikis and personal sites, on all of which people would publish the information for free. Ads were largely used to cover hosting costs of the large forums and wikis, but the content creators saw not a single dime.
Over the last 15 years, the search results have shifted from this, to instead become 100% news articles and SEO spam, all with a lot of fluff and very little substance, tons of ads, pop-ups, autoplay videos and subscription walls. All of this is well funded thanks to the business model, but for what? I can't imagine anyone sitting through all that crap to get to what they were looking for. There's a reason people would tell others to add "reddit" to the search query only a few years back, and even that's becoming less worthwhile.
Is this really what we want to preserve? Big publications and their interests?
They are proposing a scheme by which AI scrapers would have to pay for the content they scrape, which could replace the ad-driven model and be viable for more creators.
(Note that everything Cloudflare talked about in this blog post also applies to adblock users, not just AI agents.)
The golden age of the Internet is not where people do it for money, or for views. That way lies clickbait and content farms. The golden age of the Internet is one where people share information because they want to.
Source: https://1gn15.com/cloudflare
The bigger problem I see coming down the line, given the current business model for the internet, is what happens when investors need a return. When the LLM only provides one “correct” answer, the only way to make money from advertisers is to influence that answer. At least just now we are presented with information which we evaluate ourselves, though even this is subject to manipulation.
That said, the mental model that this article uses only recognizes that new content can fill in "holes" (interpolation). It also can expand the boundaries in new directions (extrapolation). That is a different and harder problem. If you distribute money to people "based on what most fills in the holes in the cheese", you really aren't expanding the boundaries of human knowledge as much as you are strengthening existing knowledge. They need to take boundary expansion into account here.
I also recognize that we know where the holes or boundaries are in many fields. They are "known unknowns". But this proposal does not take into account "unknown unknowns" --- things that we do not even realize that we don't know yet. It's going to be harder to incentivize research into unknown unknowns when we don't even know what they are yet.
> What's most interesting is what content companies are getting the best deals. It's not the ragebait headline writers. It's not the news organizations writing yet another take on what's going on in politics. It's not the spammy content farms full of drivel. Instead, it's Reddit and other quirky corners that best remind us of the Internet of old.
Removing all the vibrant parts is long due, apparently :')
The new model will be something like:
1. A content creator creates a web site and uses Cloudflare.
2. AI companies pay Cloudflare to allow them to scrape content.
3. Cloudflare gives a cut to the content creator.
4. Users pay AI companies and get their questions answered.
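To make the money flow in steps 2 and 3 concrete, here is a toy settlement calculation. The per-crawl price and the proxy's cut are made-up numbers for illustration; nothing in the thread says what the real terms are.

```python
from decimal import Decimal


def settle(crawls, price_per_crawl, proxy_cut):
    """Split crawl revenue between the reverse proxy and the creator.

    crawls: number of paid crawl requests.
    price_per_crawl: Decimal price the AI company pays per request.
    proxy_cut: Decimal fraction (0 to 1) kept by the proxy.
    Returns (proxy_share, creator_share) as Decimals.
    """
    if not Decimal("0") <= proxy_cut <= Decimal("1"):
        raise ValueError("proxy_cut must be between 0 and 1")
    gross = price_per_crawl * crawls
    proxy_share = gross * proxy_cut
    return proxy_share, gross - proxy_share


# Hypothetical terms: 1,000 crawls at $0.01 each, proxy keeps 30%.
proxy_share, creator_share = settle(1000, Decimal("0.01"), Decimal("0.30"))
```

With those assumed terms, 1,000 crawls gross $10, of which $3 goes to the proxy and $7 to the creator, which gives a sense of how much crawl volume a site would need before this is more than pocket change.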
A few observations/predictions:
* If this works, there will be competitors to Cloudflare (AWS, Microsoft, etc.) who will offer better terms to content creators. Content creators can then (easily) switch to whichever reverse-proxy has the best terms.
* Media companies will transform into Cloudflare competitors, aggregating content and monetizing by selling to AI. Their pitch will be that the content will be more curated than Cloudflare. Their brands might survive if the AIs pass the source of the content all the way to the user. For example, the AI says something like, "According to a BBC contributor....". Otherwise, media brands will no longer be known to consumers (only AI companies will care).
* If this works, AI companies will try to cut out the middle-man by building their own ecosystem of content creators.
* As more and more people get their answers directly from AI, it will be easier to sell content directly to AI companies. I.e., instead of publishing something on the open web and relying on robots.txt to protect your content, you will sell content straight to the AI company. NOTE: If this happens, then the only way this will scale is if the AI itself decides which content it wants to buy for the next training run.
* At the limit, the web and everything about it basically disappears. Everyone gets their content directly from an AI and never visits a web site directly. Therefore, web sites disappear and all that's left is the HTTP protocol, which is used by AI clients to talk to the AI cloud.
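On the robots.txt point in the list above: the file is purely advisory, so "relying on robots.txt" only works against crawlers that choose to check it. A minimal sketch of that check, using Python's standard `urllib.robotparser`, looks like this. The robots.txt content and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url.

    This is the check a well-behaved crawler runs on itself;
    nothing on the server side enforces the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A crawler that skips this call, or sends a different user agent, sees the whole site, which is why the proposals in this thread move enforcement to the proxy.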
Ok, on the toxic nature of the internet and social media in general and what has become of our digital age, I totally agree that it's rage and click bait. I wrote something to that effect here https://github.com/micro/mu/issues/27. But I personally don't think we're going to directly interact with agents the whole time. They will exist, they will be somewhere in the middle layers, there might even be a chat interface that replaces search queries with answers, but I think the web as a whole and social media need a rethink. Ads as a business model have to die, even though clearly they won't, and we need to shift our attention elsewhere.