I don't think anyone has it sorted yet. LLM search will always be flawed because it's a next-token guesser - it cannot be trusted for "facts". An LLM "fact" is not even a considered opinion; it is simply next-token guessing. LLMs certainly cannot be trusted for current affairs - they will always be out of date, by definition (they need retraining).
Modern search - Goog or Bing or whatever - seems to be somewhat confused, ad-riddled and stuffed with rubbish results at the top.
I've populated a uBlacklist with some popular lists and the results of my own encounters. DDG and co are mostly useful now, for me.
I've entirely given up on Google.
I've made extensive shortcuts so I can directly search various sites straight from my location bar: wikipedia, wiktionary, urbandictionary, genius, imdb, onelook, knowyourmeme, and about two dozen suppliers/distributors/retailers where I regularly shop.
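For example, in Firefox this can be set up with keyword bookmarks, where `%s` in the bookmark URL is replaced by whatever you type after the keyword. A sketch (endpoints are illustrative; check each site's actual search URL):

```text
keyword: w    url: https://en.wikipedia.org/wiki/Special:Search?search=%s
keyword: ud   url: https://www.urbandictionary.com/define.php?term=%s
```

Then typing `w adversarial search` in the location bar goes straight to the Wikipedia search results.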
If I need something that's not on that list, I'll try some search engines but I start with the assumption that I'm not going to find it, because the battle for search is lost.
And if people on dialup connections think you’re slow, it’s because you are.
The web has changed: it's an adversarial system now, where results are aggressively bad and constantly trying to trick you. Something like Google is much harder to implement now.
These days I can't even -exclude terms that I know would only appear in the wrong results; Google will show me those results anyway. Nothing about adversarial SEO requires them to ignore my input; that's a different choice.
I have used Google very little for about 3 years now. Sometimes when DDG fails to find what I'm looking for I'll try Google. It rarely works better.
I tried to switch to DDG because Google was blocking Hurricane Electric IPv6 tunnels. DDG is still my homepage but I usually end up clicking the bookmark I made for ipv4.google.com. I wish I knew why DDG works for all you people but it's horrible for me.
I don't understand why they got rid of these escape hatches. Sometimes I want the "top" pages containing precisely the text I enter -- no stemming, synonyms, etc. Maybe it shouldn't be the default, but why make it impossible?
In my ideal search world, there would also be an option to eliminate any page with a display ad or affiliate link. Sometimes I only want the pages that aren't trying to make money off of me.
This would have to track the number of ads and trackers on a page, not just apply to product pages. The measure would also fight SEO spam, as the tracking and advertising elements would cause SEO spammers to lose rank on the engine (disincentivising an arms race).
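As a sketch, such a demotion could be as simple as scaling relevance down by the combined ad and tracker count (the `penalty` weight here is an arbitrary assumption):

```python
def rank_score(base_relevance, n_trackers, n_ads, penalty=0.15):
    """Demote a page in proportion to how many trackers and ads it carries.

    penalty is an illustrative weight; a real engine would tune it.
    """
    return base_relevance / (1 + penalty * (n_trackers + n_ads))
```

A clean page keeps its full relevance score, while an ad-heavy one sinks below otherwise less relevant but cleaner pages.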
Add in the patently obvious need for a power user's second search bar, which takes set-notation statements and at least one of the popular, powerful regex languages. Finally, add cookie-stored, user-suppliable domain blacklists and whitelists (downloadable as a .txt and re-uploadable later on a new browser profile if needed). I never, ever want to see Experts Exchange in my results for any reason, as an immediately grasped example. Give users more control; quit automagicking everything behind a conversationally universal idiot-bar!
An "advanced mode" supporting literal keywords (with and without stemming) and boolean operators wouldn't cost the search companies anything. I think supporting regexp search would be hard: do you search your index for fixed substrings and expand around them? I'm not a search person...
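One real-world answer to that question (Russ Cox wrote it up for Google Code Search): index every document by its trigrams, use a literal substring the regex must contain to shrink the candidate set, and run the actual regex only on those candidates. A toy Python sketch, assuming the caller supplies such a required literal:

```python
import re
from collections import defaultdict

def trigrams(s):
    """All 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

class TrigramIndex:
    """Toy index: maps each trigram to the set of doc ids containing it."""

    def __init__(self, docs):
        self.docs = docs
        self.index = defaultdict(set)
        for doc_id, text in enumerate(docs):
            for t in trigrams(text):
                self.index[t].add(doc_id)

    def search(self, pattern, required_literal):
        # Candidate docs must contain every trigram of the required literal...
        candidates = None
        for t in trigrams(required_literal):
            posting = self.index.get(t, set())
            candidates = posting if candidates is None else candidates & posting
        if candidates is None:  # literal shorter than 3 chars: scan everything
            candidates = set(range(len(self.docs)))
        # ...then the full regex confirms or rejects each candidate.
        return sorted(d for d in candidates if re.search(pattern, self.docs[d]))
```

Extracting the required literals from an arbitrary regex is the hard part; here it's passed in by hand, so treat this as a sketch of the narrowing trick, not a working engine.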
I don't think you'd need much in the way of machine learning to filter out the spam. There are relatively few third-party display ad servers and affiliate networks, and those are the main lazy ways to make money. There's no need to filter out all commercial content; just getting rid of the "passive income" bros would be enough.
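A minimal sketch of that filter, using hypothetical blocklists (real ones such as EasyList are far longer):

```python
# Hypothetical blocklists -- stand-ins for real, much larger lists.
AD_SERVERS = {"doubleclick.net", "adnxs.com", "googlesyndication.com"}
AFFILIATE_NETWORKS = {"amzn.to", "shareasale.com", "awin1.com"}

def is_monetized(page_resource_domains, page_link_domains):
    """Flag a page that loads a known ad server or links into an affiliate network."""
    return bool(page_resource_domains & AD_SERVERS or
                page_link_domains & AFFILIATE_NETWORKS)
```

No machine learning needed: set membership against a curated list catches exactly the lazy third-party monetization described above.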
People used to spend money on books and magazines, I'm sure some of them could be convinced to sign up for a Netflix of books and magazines.
It's how the vast majority of human knowledge has been stored and perpetuated for millennia.
This new business of people writing up their knowledge for free (wikipedia, stackexchange, forums, reddit, etc) is relatively new, and only semi-working.
Even today, Encyclopaedia Britannica is still selling an encyclopedia product for about $75 a year.
The new business model is competing with the old one. Generally the new model is winning on price and breadth, by a fairly wide margin. Accuracy seems about the same for both business models.
A business model doesn't need to be perfect to win, it just needs to be slightly better.
> It's how the vast majority of human knowledge has been stored and perpetuated for millennia.
I think writing books for money is a newish phenomenon. I think historically you probably had more of a patronage system for most art. Selling your written work was a lot harder in the pre-printing-press era.
That being said, Wikipedia is not a substitute for the "old" web.
There's a very strong financial incentive for ad-powered search engines to keep SEO spam out of search results, because that makes advertisers more willing to pay for search placement. A publicly run search engine would not have those incentives and would, if anything, be at graver risk of a "tragedy of the commons" type scenario where the engine is overtaken with spam.
Yes, there are perverse incentives to populate search engine results with paid placements, but the best corrective force I can think of is having more competition in the search space. As long as people are willing to try other search engines (spoiler: for the most part, they are currently not), this creates a strong incentive to ensure that paid placements that harm the search experience are kept to a minimum.
...and I think the concerns about profitability of the LLM space are completely missing the larger agenda. Even if public use of OpenAI and its competitors NEVER TURNS A PROFIT, there is tremendous economic opportunity that investors expect to realize from a company with intelligent/powerful LLMs. That is why they are pumping so much money into these companies.
>A publicly run search engine would not have those incentives and would be at if anything graver risk of a "tragedy of the commons" type scenario where the engine is overtaken with spam.
But then how come that is exactly what is happening with modern search engines? It’s just always advertising that comes along and fucks up a good thing.
Because Google runs Google Search, it's bad for a different reason: profit motivates Google to push its own services, increase ad impressions/interactions, and incentivize users not to actually leave the search engine result page (e.g. by citing or summarizing the content of the related web pages on the result page itself).
And because competition is good that means Google actively competing with the sites it indexes for the attention of its users is a good thing even if those sites losing out and failing would result in worse search results over time.
Hang on, maybe "content stealing" really already was a problem before LLMs made it their entire MO and those greedy newspaper publishers were onto something when they complained about Google lifting their news feeds even if it provided "exposure" for them.
No. Shitty results keep people exposed to ads for longer, on average.
'A publicly run search engine would not have those incentives and would be at if anything graver risk of a "tragedy of the commons" type scenario where the engine is overtaken with spam.'
You might be pleasantly surprised by the main work of Elinor Ostrom, the 2009 Nobel laureate in economics: Governing the Commons.
Google gets money from sending users to ad-laden sites. They get to double-dip.
If Google _didn't_ own Doubleclick etc, then there would be an incentive for them to prioritise content over content farms etc.
Given that Kagi's higher tier plans come with search-enabled LLM chat interfaces, and those searches use Kagi's results (which, again, appear to be superior) it seems to me that you get the best of both worlds: Better search, and better search results to feed into your search-enabled LLM queries.
I am not affiliated with Kagi or anything, it's just honestly that good a product.
It never occurred to me I could use Google this way. And it is a novel idea to me, that it seems to be better to use AI than read a manual.
If it's not, then it will just invent some plausible-sounding bullshit that doesn't actually work.
After the fifth time you get burnt by this the whole LLM experience starts to sour.
I don't think it's their fault, or that it happens there more than anywhere else, nor do I know what they could do about it, but it happens.
The internet is dead and is starting to smell.
However, the reason I started paying for Kagi was because they let me completely block websites from search results -- and they still let me do that. That feature alone will keep me as a paying customer for the near term.
This was in the top three results, and I can't tell if it's "real", or just a page created to capture those exact search terms!
If the goal of that site really is just to capture clicks via very specific web searches... and people are creating sites like that at scale... what hope do we have of saving the web? :-(
Yet. It's only a matter of time before AI becomes ad-riddled and enshittified.
Sky TV makes 10 times as much from subscription as adverts but spends 30% of the time showing you adverts. London underground revenue is a similar ratio - for every £9 tickets they make £1 in adverts. If I go to the cinema they spend 20 minutes showing adverts to people who spent £50 on tickets and popcorn.
Companies have very little incentive not to make things shit with adverts. The measurable cost to them is tiny; the cost to the rest of the world is massive. Odeon won’t attribute lost revenue from my reduced visits to their adverts, but will measure the 50p or whatever they get.
Model weights are snapshots, and we can preserve them.
It would be like if we could keep a snapshot of the search index for google every 6 months. Doesn't matter if the "current" version is garbage, if my search target exists in an older copy that's not as corrupted, and I can choose to use that instead.
And at least this time around, I think this was built in from the start - you pin against a specific model for most serious business use-cases.
I can store open model weights locally, cheaply.
---
So I 100% agree that the ads are going to come (I can't foresee any possible alternative outside of banning ad-based content promotion - which, as an aside, I'm strongly in favor of proposing as serious legislation, particularly in the context of AI).
But this time the ad riddled version still has to be better than the old version I can boot up and run.
It'll be interesting to see how that tension plays out.
I am also a heavy Kagi and Reddit user for search, and usually that's enough. But when it's not, it's concerning how much better other search engines can be, especially since non-tech-savvy folks will never use them.
apparently, I need to make a selection of a search engine to use this.
I would not use this as a replacement for my duckduckgo or google searches simply because of the UX of not being able to type a query and press "enter" as the default.
You can probably hack that experience by making use of the "rules" feature. You can have certain search engines or macros launch automatically upon pressing enter, based on the content of the query. So if you set a rule to check whether your search contains a vowel (which most will), it's effectively a catch-all rule.
Hacky, but it will work.
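For what it's worth, the vowel check behind that hypothetical catch-all rule is trivial to express; note it would still miss vowel-free queries like "xkcd":

```python
import re

# Nearly every real query contains a vowel, so this predicate
# effectively matches everything -- a poor man's catch-all rule.
def matches_catch_all(query):
    return re.search(r"[aeiou]", query, re.IGNORECASE) is not None
```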
Asking for "the Internet" to be funded the same way as real libraries is quite the contrary to the dominant cultural narrative which asks for public services and entire governments to be operated "like a business", which usually means cutting funding, selling assets, doing layoffs and eventually scrapping it for parts once it has become predictably defunct.
I'm not aware of any advertising-funded libraries, although I'm sure the idea has been considered. I think that in the physical world the cost per visitor in a library is probably too high for such a funding model to be successful. Also, the library is not typically visited by the highest-value advertising targets, limiting the amount which can be raised this way. I think these are probably the real reasons why real libraries aren't like this!
They are incredibly good value for money, if you understand the benefits and take a long view. However, politically they are an easy thing to cut.
Libraries are not advertising funded because advertisers are not very interested.
If advertisers were interested then these libraries would likely be just as infested with adverts as the internet.
If we tried to fund the internet in the same way as libraries advertisers would still be strongly incentivised to show adverts and would still work hard to find ways to do so. So it's not clear the situation would change much if there was government funding for content creators.
> This is just my anecdata
Not a single example, so it's not even that, just vibes
> the past few years. There’s recent science to read about the quality of search.
Let's look at the "science"
> We monitored Google, Bing and DuckDuckGo for a year on 7,392 product review queries.
Oh, so there is nothing about the past few years either. And this isn't "search", but a very narrow category of search that is one of the prime targets for SEO scam, so it's always been bad, and the same incentives made review content highly suspect before any SEO was ever involved.
Which multiple engines can help you here? Which LLMs?
And well, the article is ostensibly about AI, but then at the end:
> The investors aren’t just doing this to be nice. Someone is going to expect returns on this huge gamble at some point.
> ...
> The LLM providers aren’t librarians providing a public service. They’re businesses that have to find a way to earn a ridiculous amount of money for a huge number of big investors, and capitalism does not have builtin morals.
Those are the things that need to change. They have nothing to do with AI. AI is a symptom of a broken socioeconomic system that allows a small (not "huge" in the scheme of things) number of people to "gamble" and then attempt to rig the table so their gamble succeeds.
AI is a cute bunny rabbit and our runaway-inequality-based socioeconomic system is the vat of toxic waste that turned that innocent little bunny into a ravening mutant. Yes, it's bad and needs to be killed, but we'll just be overrun by a million more like it if we don't find a way to lock away that toxic waste.
Not inherently, but I think LLM services (and maybe other AI based stuff) are corruptible in a much more dangerous way than the things our socioeconomic system has corrupted so far.
Having companies pay to end up on the top of the search engine pile is one thing, but being able to weave commerciality into what are effectively conversations between vulnerable users and an entity they trust is a whole other level of terrible.
Many of us - naively, in hindsight - really did hope this wouldn't happen at the scale it did, and were appalled at how many big players actively participated in speeding up the process.
I guess it's similar to how a lot of white folks thought racism was over until Obama came along and brought the bigots out of the woodwork.
> lock away that toxic waste
The jarring conclusion I keep trying to see a way around but no longer can is that the toxic waste is part of humanity. How do we get rid of it, or lock it away? One of the oldest questions our species has ever faced. Hard not to just throw up your hands and duck back into your hidey-hole once you realize this.
Sure, maybe so. But now with hindsight we can see what happened and we should realize that it's going to happen again unless we do something.
> The jarring conclusion I keep trying to see a way around but no longer can is that the toxic waste is part of humanity. How do we get rid of it, or lock it away? One of the oldest questions our species has ever faced. Hard not to just throw up your hands and duck back into your hidey-hole once you realize this.
I think both bad and good are part of humanity. In a sense this "toxic" part is not that different from the part that leads us to, say, descend into drug addiction, steal when we think no one is looking, leave a mess for other people to clean up, etc. We can do these negative things on various scales, but when we do them on a large scale we can screw one another over quite egregiously. The unique thing about humans is our ability to intentionally leverage the good aspects of our nature to hold the bad aspects in check. We've had various ways of doing this throughout history. We just need to accept that setting rules and expectations and enforcing them to prevent bad outcomes is no less "natural" for humans than giving free rein to our more harmful urges.
The EU creates an institution for public knowledge, a kind of library+tech solution. It probably funds classic libraries in member countries, but it also invests in tech. It dovetails nicely into a big push to get science to thrive in the EU etc.
The tech part makes an in-the-public-interest search engine and AI.
The techies are incentivised to try and whack-a-mole the classic SEO. E.g. they might spot pages that regurgitate, they might downscore sites that are ad-driven, they might upscore obvious sources of truth for things like government, they might downscore pages whose content changes too much etc.
And the AI part is not for product placement sale.
This would bring in a golden age of enlightenment, perhaps for - say - 20 years or so, before the inevitable erosion of base mission.
And all the strong data science types would want to work for it!
Which will be provided by a private sector contractor and it goes to the lowest bidder who offsets their costs with advertising.
In the same way that the early google made it great to put top minds on mundane problems, let’s imagine that an institute can make a knowledge-first search engine and AI. It’s about aligning incentives.
The closest equivalent thing we have today is (in my mind) places like the Apache Foundation or LetsEncrypt, places that run huge chunks of open source software or critical internet structure. An “Apache for search” would be great.
This is pretty much the best arrangement we have come up with so far in human civilization. You seem to be suggesting a tragedy of the commons is instead the ideal we should strive for.
I actually have a theory about this. I hate it, but I can absolutely imagine this future.
I'm going to specifically talk about the software engineering industry, but let's assume that LLMs progress to the stage where "vibe coding" can be applied to other areas ("vibe writing", "vibe research", "vibe security", "vibe art", "vibe doctors", "vibe management", "vibe CEOs", etc.)
It only takes a few years of "vibe coding graduates" to be successful in their work to create a new class of software engineer - this is in fact what AI companies are actively encouraging / envisioning as the future. Assuming this happens in the next few years, we're still in the phase where AI companies are burning money acquiring as many of these users as possible.
In about 5 years, some of those vibe coders will become vibe managers, and executives will no doubt be even more invested in LLMs as the solution to their problems.
At a certain tipping point, a large part of the industry can't actually function effectively without LLMs. I don't know when this point will be, but vibe coders (or other vibe <industry>ers) don't have to be a majority, they just have to be a large enough group.
Suddenly AI companies have all their losses called in and they have to pay back their VCs.
LLM usage prices skyrocket.
----
Four things happen across 2 axes:
- [A] Companies that can afford to pay skyrocketing LLM costs, vs. [B] those that can't
- [C] Companies that have reached a critical mass of vibe coders, vs. [D] those that haven't
----
[BC] These companies collapse. They don't have talent and they can't afford to outsource it to LLMs anymore.
[BD] These companies lay off all their vibe coders because they can't afford LLMs anymore. They survive on the talent they retain, but this looks very different if you're a large or small business. Small businesses probably fail too.
[AC] These companies see an enormous increase in costs that they cannot avoid. Large layoffs likely, but widespread vibe coding continues.
[AD] These companies have a decision to make: lay off all their vibe coders, or foot the LLM bill. The action they take will depend on their exact circumstances. Again, most small business in this situation probably fail.
---
The real question is, for the surviving vibe companies [AC, AD]. Will they be able to sustain such high costs in the long-run, and even if they can, will enough be able to sustain them to successfully pay back all the AI companies' losses to that date?
Interesting times ahead, maybe.
The amount of money being burned on this is giant; those companies will need to make so much money to have any possibility of a return. The idea is that we will all spend more money on AI than we spend on phones, and we will spend it on those companies only... I don't know, it just doesn't add up.
As a user it's a great free ride though. Maybe there IS such a thing as a free lunch after all!
I think with the last sub-clause (that I emphasized), you answered your question: because the conversation is more intimate, Google learns more about your "true interests", be it to make advertising to you more targeted, or for more sinister purposes.
They are also trying to use the trained LLMs to reduce the number of workers in search and elsewhere, so the other idea is that they get lower costs per user.
I don't think many people will pay monthly fees, or will want to pay them for each platform they use, which is why they all make so many questionable integration attempts to keep users from wanting to use a separate LLM of their choice in a browser instead.
Gemini Pro requires a monthly subscription.
Seems like a pretty straightforward business model.
You mean like how the instant we see ads on YouTube we will stop using it?
But what if you don't see distinct ads? LLM advertising can just be paid-for bias in the generated content.
Ad budgets aren't bottomless. People making decisions about them have a lot of options for where to spend. They want provable attribution so they can tell which channel is giving them the most bang for the buck. If that exists, then the ads will be discernible.
Now if AI turns out to be the next big thing, they can steer differently next time, sell subscriptions, and avoid all that entanglement with multi-sided markets and layered revenue strategies. At least that’s my take.
Which makes it all the weirder that they seem to be intentionally sabotaging it by nerfing Search into unusability.
Eventually they will break the "trust thermocline" in their search results and that will blow it up but on the way they'll keep making more and more money from every damaging change they make.
Why? The websites we visit can still be infested with google's ads, and so can our gmail accounts, and so can the youtube videos we watch, and they can push ads directly onto our cell phones 24/7. Google has plenty of ways to force ads into your life.
Google used to need search in order to build extensive dossiers on everyone. It told them what people were looking for online. What they were interested in. Now Google has their cell phones, their browser, and their DNS servers doing that for them. Most people are handing all of their browsing history to Google. Google doesn't need search, which is why it's been allowed to atrophy into uselessness.
People didn’t stop using Google Search, Facebook, Instagram, YouTube, TikTok and a myriad other services and products (and things like TV before those) because they got ads.
My companies currently spend more on AI than on phones - hardware and subscriptions. It's now the second-highest expense after salaries and directors' remuneration.
But we are lily-white both legally and ethically. One of the perks to a lifestyle business beholden to no investors.
> The goals of the advertising business model do not always correspond to providing quality search to users.
- Sergey Brin and Lawrence Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine
Read your own link:
> For example, entering the query "buy domain" into the search box on Google’s home page produces search results and an AdWords text advertisement that appears to the right of Google’s search results
> Google’s quick-loading AdWords text ads appear to the right of the Google search results and are highlighted as sponsored links, clearly separate from the search results. Google’s premium sponsorship ads will continue to appear at the top of the search results page.
what is this instinct? anyone that’s over the age of 25 would know
What's the difference? In astroturfing, someone pays people to form an organization, claim to have no external support, and do some kind of activism.
In hasbara, the government of Israel pays people to not form an organization, claim to have no external support, and do various kinds of pro-Israel and pro-Jew activism. This looks like astroturfing with the major vulnerability of the no-external-support claim shored up.
"The rules were you guys weren't going to fact check."
The instinct is about pointing out factual inaccuracies. What they wrote is either correct, or not. If it is not, and someone knows better they can and should point that out.
If you, or some other commenter, have a fuzzy feeling that google is worse than it used to be you are free to write that. You are perfectly entitled to that opinion. But you can't just make up false statements and expect to be unchallenged and unchallengeable on it.
There is a very strong stance on this site against talking about astroturfing, and I understand it. But for the life of me, I cannot figure out where this general type of sentiment originates. I don’t know any google enthusiasts and am not sure I’ve ever met one. It’s a fairly uncontroversial take on this website and in the tech world that google search has worsened (the degree of which is debatable). Coming out and saying boldly “no it isn’t, you’re lying” is just crazy weird to me and again I’m very curious where that sentiment comes from.
see some of the sibling and aunt/uncle comments in this thread to get at a little of what I’m talking about.
> If Google [had been] broken up 20 years ago [...] [e]veryone would still be paying for email.
Some people don't have the foggiest idea what they're talking about. But I don't really see that as suggesting they're part of an organized campaign.
I think Google search has gone downhill tremendously to the point of near uselessness and have been a Kagi subscriber for awhile, but I don't see astroturf in this instance. Do you have other examples?
I wasn't a fan for very long. Google got creepy fast, and at this point their search is becoming useless, but for a short time I really thought that Google was amazing and I was an enthusiast.
I believe I have covered that case in my comment. Let me quote the relevant part here for you: “What they wrote is either correct, or not. If it is not, and someone knows better they can and should point that out.”
That being said could you help me by pointing out the inaccuracy in jkaptur’s comment? It seems fairly simple and as far as I can see well supported by the source.
You can sealion with posts like this all you want but every time someone counters a post like this with ample evidence it gets group downvoted or ignored. You are also making an assertion that you’re free to back with evidence, that google and google products are not noticeably worse than 10 years ago.
here’s one study that says yes, it is bad:
https://downloads.webis.de/publications/papers/bevendorff_20...
Since we don’t have a time machine and can’t study the google of 2015 we have to rely on collective memory, don’t we? You proclaiming “it’s always been this way” and saying any assertion otherwise is false is an absolutely unfalsifiable statement. As I said, anyone over 25 knows.
Besides perusing the wealth of writing about this over the last two years or so, in which the tech world at large has lamented how bad search specifically has gotten, we also see market trends where people are increasingly adopting tools like ChatGPT and LLMs as a search replacement. Surely you, a thinking individual, could come to the pretty obvious conclusion as to why that might be: google search has gotten a lot worse. The language models are well known to make stuff up, and people still prefer them because search is somehow even less reliable and definitely more exhausting, and it was not always this way. If it was always this way, why are so many people turning to other tools?
Sounds like it should be very easy to counter their argument then.
For my education could you tell me which part of their message is inaccurate? The “Google was founded in 1998” or the “and you could buy ads on the search results page in 2000.” part?
> You are also making an assertion that you’re free to back with evidence, that google and google products are not noticeably worse than 10 years ago.
I did not make such an assertion. Where in my comment do you think i’m making that assertion?
> You proclaiming “it’s always been this way”
I’m sorry but who are you quoting? Did you perhaps misclick which comment you wanted to respond to?
It quickly turned Google into the biggest / most valuable internet company of all time ever, and it still wasn't enough for them.
I've had adblockers running for as long as I can remember, so I'm blissfully unaware of how bad it is now... mostly - I don't have adblockers on my phone, and some pages are unusable.
Ads done right are the least bad way of supporting free stuff for people who don't want to pay the cost. But people with uBO punish all sites regardless of whether they do ads nicely or not.
You are right now writing in a thread about an upcoming future where promotion is embedded in the content, so that the content itself is one big ad disguised as whatever. Do you really think that's a better alternative to clearly delimited and unmistakable ads?
I'd guess we're only 6-12 months out from a full advertisement takeover.
So you are just getting SEO'd pages (i.e. ads) regurgitated to you.
Have you considered buying a ChatGPT filter/scrubber to clean your results? Only $9.99 a month! Not available in all areas, not legal in most of the world.
;) and </s>
Or support an open source AI model.
I stopped using ChatGPT when it started littering my conversation with emojis. It acts like one of those overzealous kids on Barney.
The goal isn't to have an ad->purchase, the goal is to make sure the purchase is more likely in the long term.
This is the machine that magicians program for: https://www.youtube.com/watch?v=wo_e0EvEZn8
I think if you had this incredible technology that could manipulate language to nudge readers in the softest possible way toward thinking a little bit more about buying some product, so that in aggregate you'd increase sales in a measurable way that nobody would ever notice, it would just quickly just devolve into companies demanding the phrase "BUY MORE REYNOLDS GARBAGE BAGS!!!!!!!!" at least 7 times.
I was going to write a rebuttal to this, about how more subtle forms of advertising are likely not very effective, and then I remembered subliminal advertising.
It's largely been banned (I think), but probably only because it's relatively easy to define and very easy to identify. In the case of LLMs, defining what they shouldn't be allowed to do "subliminally" will be a lot harder, and identifying it could be all but impossible without inside knowledge.
How effective is it? We don't know, but there is nothing of potential value to lose so nobody really cared. Just ban it and move on.
Oh come on.
Genuinely.
Come on.
Look at every single tech innovation of the last 20 years and say that again.
OTOH, there are far too many people who desperately want to believe in cool new stuff really being free, without any "gotcha" down the line.
I wish we wouldn't mindlessly repeat these platitudes. Try and falsify your statements before posting.
In our current regulatory and economic environment, it appears that mission-driven, long-term oriented, ethical companies are typically out-competed by finance-driven, short-term oriented, greedy companies.
The article describes the struggle of using a search engine in 2025. Which is to say, using Google in 2025. Search engines benefit greatly from huge economies of scale, and most websites are optimized for Google SEO and for their ad network. Sure, the folks at DuckDuckGo (my search engine) or Kagi appear to be your good sort of company, but the revenue and popularity of those companies is a rounding error in comparison to Alphabet, Inc. They can't afford the crawlers and infrastructure of the big finance-oriented players, they can't convince most websites to optimize for their engine, and most people don't even know they exist.
Sure, there's a handful of people running the equivalent of a small-town grocery with local farm-sourced produce and hand-selected general goods as a passion project, working long hours and slowly chewing through their savings. And there are a handful of people who feel that the existence of such a place is important, and shop there out of principle in spite of the incentives and penalties associated with that behavior. But most of the country is overrun by Dollar Generals and Walmarts.
By labeling the salaries as R&D assets and amortizing that over 5 years (instead of taking it completely in the first year), they're more likely to make "accounting" profit and pay taxes in the early years.
Those legislative changes will likely move forward the taxes being paid.
But to your point: not paying taxes because a company is investing doesn't mean taxpayers are footing the bill. It does mean the company isn't contributing to paying taxes while it is in "growth investing" mode.
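To make the amortization point concrete, here's a toy calculation with made-up numbers (the real Section 174 rules include a half-year convention and other details omitted here):

```python
# Toy illustration of R&D salary amortization (hypothetical figures).
revenue = 1_000_000
rd_salaries = 1_000_000

# Old treatment: deduct salaries in full the year they are paid.
taxable_expensed = revenue - rd_salaries  # no taxable profit

# Amortized treatment: spread over 5 years, so only 1/5 of the
# salaries is deductible in year one, leaving "paper" profit to tax.
taxable_amortized = revenue - rd_salaries / 5

print(taxable_expensed)   # 0
print(taxable_amortized)  # 800000.0
```

Same cash out the door, but the amortized books show a taxable profit in year one, which is exactly the "more likely to pay taxes in the early years" effect.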
Maybe nvidia can be a winner selling shovels but it seems like everyone else will just be fighting each other in the massive pit they dug.
They don't need a winner, they want the race to continue as long as possible.
The question is, will AI chat or search ever be profitable? What enshittification will happen on that road? Will AIs be interrupting conversations to espouse their love of nordvpn or raid shadow legends?
Once the pressure to turn a profit is high enough the big players surely won't just leave that money on the table.
The scary part is that even if we end up paying for "ad-free" LLM services how do we really know if it is ad-free? Traditional services are (usually) pretty clear on what is an ad and what isn't. I wouldn't necessarily know if raid shadow legends really is the greatest game of all time or if the model had been tuned to say that it is.
For now Copilot is a fixed $20 / month / person, but it's only a matter of time before it becomes metered, or the advanced models cost more credits. This is also why they're pushing for agents: a single query is cool and all, and how much it costs in compute is reasonably predictable, but an agent can do a lot of interesting things, rack up 100x the usage of a single query, and put 100x the charge on the corporate credit card.
It'll probably have a chilling effect, with companies being like "ok maybe let's tone down a bit on the AI usage", just like how they hire consultants to bring down their runaway AWS costs.
A free lunch that costs our environment though, which is a big caveat :-)
But to be honest, optimizing monstrously slow processes that cost weeks of human labour by automating them, that saves a ton of energy as well. It’s not zero sum, as the humans spend that energy elsewhere, but ideally they spend it on more productive things.
This calculus can very quickly offset whatever energy is wasted generating cartoon images of vuvuzelas.
Yes, I do agree with this. However, that's only good as long as there wasn't a better way of optimizing them, assuming we wouldn't be better off getting rid of those costly processes altogether.
> ideally they spend it on more productive things
Same gotcha as mentioned in my other comment: "productive" in our growth economy often means "damaging to the environment", because we are collectively spending a lot of our time producing garbage and that's not something we should really optimize. Most of us work a fixed amount of hours so it's not like we are doing ourselves any favor by optimizing time in the end.
Under another system, I wouldn't say this. I'm generally for freeing up time for us so we can have better lives.
I'm not convinced. This article focuses on individual use and how inconsequential it is, but it seems like to me it dismisses the training part that it does mention a bit too fast to my taste.
> it’s a one-time cost
No, it's not. AI companies constantly train new models, and that's where the billions of dollars they get go. It's only logical: they're trying to keep improving. What's more, the day you stop training new models, the existing models will "rot": they will keep working, but on old data; they won't be fresh anymore. The training will continue, constantly.
An awful quantity of hardware and resources is being monopolized that could be allocated to something worthier, or just not allocated at all.
> Individuals using LLMs like ChatGPT, Claude, and Gemini collectively only account for about 3% of AI’s total energy use after amortizing the cost of training.
Yeah, we agree, running queries is comparatively cheap (still 10 times more than a regular search query though, if I'm to believe this article (and I have no reason not to)) after amortizing the cost of training. But there's no after, as we've seen.
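To see why "there's no after" matters, here's a back-of-the-envelope amortization with entirely hypothetical numbers:

```python
# Hypothetical per-query energy accounting (Wh). These numbers are
# made up for illustration, not measurements.
inference_wh = 3.0             # energy to serve one query
training_wh_total = 1e12       # energy for one full training run
queries_before_retrain = 1e12  # queries served before the next model ships

# Per-query cost including an amortized share of the training run:
amortized = inference_wh + training_wh_total / queries_before_retrain
print(amortized)  # 4.0
```

The training term only vanishes if `queries_before_retrain` goes to infinity, i.e., if training ever actually stops. As long as retraining recurs, that share is a permanent per-query surcharge rather than a one-time cost.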
As long as these companies are burning billions of dollars, they are burning some correlated amount of CO2.
As an individual, I don't want to signal to these companies, through my use of their LLMs, that they should keep going like this.
And as AI is more and more pervasive, we are going to start relying on it very hard, and we are also going to train models on everything, everywhere (chat messages, (video) calls, etc). The training is far from being a one shot activity and it's only going to keep increasing as long as there are rich believers willing to throw shit-tons of money into this.
Now, assuming these AIs do a good job of providing accurate answers that you don't have to spend more time on proofreading / double checking (which I'm not sure they always do), we are unfortunately not replacing the time we won by nothing. We are still in a growth economy, the time that is freed will be used to produce even more garbage, at an even faster rate.
(I don't like that last argument very much though; I'm not for keeping people busy at inefficient tasks just because. But this unfortunately needs to be taken into account, and that's a harsh reality that also applies to my day-to-day job: as a software developer, my job is essentially to automate tasks for people so they have more free time, because the computers can now do a bit more of their work. But as a species, we've not increased our free time. We've just made it more fast-paced and stressful.)
The article also mentions that there are other things to look into to improve things related to climate change, but the argument goes both ways: fighting against power-hungry LLMs doesn't prevent you from addressing other causes.
But there are free (to copy) ones, and smaller ones. And while those were built from the large, expensive models, it's not clear if people won't find a way to keep them sustainable. We have at minimum gained a huge body of knowledge on "how to talk like people" that will stay there forever for researchers to use.
This is spot on. I think we'll be able to capitalize on other talents of "AI" once we recognize the big shift is done happening. It's like five years after the Louisiana Purchase: we have a bunch of new resources but we've barely catalogued them, let alone begun to exploit them.
> how long until all the LLM corporate initiatives die?
Sooner than I personally thought, and I place a lot of that with Apple. They've led the way in hardware that supports LLMs, and I believe (hope?) they'll eventually wipe out most hosted chat-based products, leaving the corporate players to build APIs and embedded products for search, tech support, images, etc. The massive amounts of capital going into OpenAI, Anthropic, etc., will ebb as consumer demand falls.
I hope for this because the question I keep asking is, how can our energy infrastructure sustain the huge demand AI companies have without pushing us even further into a climate catastrophe?
One thing about LLMs used as a replacement for search is that they have to be continually retrained or else they become stale. Let's say a hard recession hits and all the AI companies go out of business, but we're left with all these models on Hugging Face that can still be used. Then a new programming language hits the scene and it's a massive hit; how will LLMs be able to autocomplete and add dependencies for a language they've never seen before? Maybe an analogy would be asking an LLM to translate a written language you make up on the spot into English or another language.
If you consider the massive environmental harm AI has caused and continues to cause, the people whose work has been stolen to create it, the impacts on workers and salaries, and the abuses AI enables, that free lunch starts looking more expensive.
> Last week, the Environmental Protection Agency (EPA) issued a rule clarification allowing the use of some mobile gas and diesel power sources for data centers. In a statement accompanying the rule, EPA Administrator Lee Zeldin claimed that the Biden administration's focus on addressing climate change had hampered AI development.
> "The Trump administration is taking action to rectify the previous administration's actions to weaken the reliability of the electricity grid and our ability to maintain our leadership on artificial intelligence," Zeldin said. "This is the first, and certainly not the last step, and I look forward to continue working with artificial intelligence and data center companies and utilities to resolve any outstanding challenges and make the U.S. the AI capital of the world."
https://www.newsweek.com/ai-race-fossil-powered-generators-a...
Here is another: https://www.utilitydive.com/news/trump-coal-executive-order-...
> In another move, DOE on Tuesday said it was offering loan guarantees for coal-fired power plant projects, such as upgrading energy infrastructure to restart operations or operate more efficiently or at a higher output.
Please elaborate.
Who are the “relevant technical authorities”?
""" [G]lobal data centre electricity use reached 415 TWh in 2024, or 1.5 per cent of global electricity consumption.... While these figures include all types of data centres, the growing subset of data centres focused on AI are particularly energy intensive. AI-focused data centres can consume as much electricity as aluminium smelters but are more geographically concentrated. The rapid expansion of AI is driving a significant surge in global electricity demand, posing new challenges for sustainability. Data centre electricity consumption has been growing at 12 per cent per year since 2017, outpacing total electricity consumption by a factor of four. """
The numbers are about data center power use in total, but AI seems to be one of the bigger driving forces behind that growth, so it seems plausible that there is some harm.
0: https://news.mit.edu/2025/explained-generative-ai-environmen...
1: https://www.itu.int/en/mediacentre/Pages/PR-2025-06-05-green...
2: (cf. page 20) https://www.itu.int/en/ITU-D/Environment/Pages/Publications/...
see also:
https://www.techrepublic.com/article/news-ai-data-centers-dr...
https://www.scientificamerican.com/article/a-computer-scient...
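For scale, compounding the quoted 12 per cent annual growth from the 415 TWh 2024 figure gives a naive extrapolation (not a forecast; growth rates shift):

```python
# Naive compound-growth extrapolation of the quoted figures.
consumption_twh = 415.0  # global data centre electricity use, 2024
growth = 0.12            # quoted annual growth rate since 2017

for year in range(2025, 2031):
    consumption_twh *= 1 + growth

print(round(consumption_twh))  # 819
```

At that rate, data centre consumption roughly doubles by 2030, which is why the growth rate matters more than the current 1.5 per cent share.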
That seems like very low impact, especially considering training only happens once. I have to imagine that the ongoing cost of inference is the real energy sink.
I am totally on board with making sure data center energy usage is rational and aligned with climate policy, but "10k trips between LA and NY" doesn't seem like something that is just on its face outrageous to me.
Isn't the goal that these LLMs provide so much utility they're worth the cost? I think it's pretty plausible that efficiency gains from LLMs could add up to 10k cross USA trips worth of air pollution.
Of course this excludes the cost of actually running the model, which I suspect could be far higher
Data centers are already significant users of renewable electricity. They do not contaminate water in any appreciable amount.
"Stolen" is kind of a loaded word. It implies the content was for sale and was taken without payment. I don't think anyone would accuse a person of stealing if they purchased GRRM's books, studied the prose and then used the knowledge they gained from studying to write a fanfic in the style of GRRM (or better yet, the final 2 books). What was stolen? "the prose style"? Seems too abstract. (yes, I know the counter argument is "but LLMs can do more quickly and at a much greater scale", and so forth)
I generally want less copyright, not more. I'm imagining a dystopian future where every article on the internet has an implicit huge legal contract you enter into like "you are allowed to read this article with your eyeballs only, possibly you are also allowed to copy/paste snippets with attribution, and I suppose you are allowed to parody it, but you aren't allowed to parody it with certain kinds of computer assistance such as feeding text into an LLM and asking it to mimic my style, and..."
AI outputs copyrighted material: https://www.nytimes.com/interactive/2024/01/25/business/ai-i... and they can even be ranked by the extent to which they do it: https://aibusiness.com/responsible-ai/openai-s-gpt-4-is-the-...
AI is getting better at data laundering and hiding evidence of infringement, but ultimately it's collecting and regurgitating copyrighted content.
"even" is odd there, of course Disney is accusing them of violating copyright, that's what Disney does.
> AI is getting better at data laundering and hiding evidence of infringement, but ultimately it's collecting and regurgitating copyrighted content.
That's not the standard for copyright infringement; AI is a transformative use.
Similarly, if you read a book and learn English or facts about the world by doing that, the author of the book doesn't own what you just learned.
Establishing an affirmative defense that it's transformative fair use would hopefully be an uphill battle, given that it's commercial, using the whole work, and has a detrimental effect on the market for the work.
Reproducing a movie still well enough that I honestly wouldn't know which one is the original is transformative?
If I download all content from a website that has a use policy stating that all content is owned by that website and can't be resold, then allow my users to query this downloaded data and receive a detailed summary of all related content, and sell that product, perhaps this is a violation of the use policy.
All of this hasn't been properly tested in the courts yet. Large payments have already been made to Reddit to avoid this, likely because Reddit has the means to fight it in court. My little blog, though, is fair game, because I can't afford to engage.
You’re talking about overt infringement, the GP was talking about covert infringement. It’s difficult to see how something could be covert yet not transformative.
that's literally what happened in innumerable individual cases, though.
Much of the content that is created by people is done so to generate revenue. They are denied that revenue when people don't go to their site. One might interpret that as theft. In the case of GRRM's books, I would assume they were purchased and the author received the revenue from the sale.
Also, LLMs don’t just imitate style, they can be made to reproduce certain content near-verbatim in a way that would be a copyright violation if done by a human being.
You can excuse it away if you want with reductio ad absurdum arguments, but the impact is distinctly different, and calls for different parameters.
It's just too dangerous to leave it in the hands of people who don't believe in science, and value money, power, and ideology more than anything else.
It's happening now, and there is nothing to stop it happening again in the future.
— https://andymasley.substack.com/p/individual-ai-use-is-not-b...
> What’s the carbon footprint of using ChatGPT?
— https://www.sustainabilitybynumbers.com/p/carbon-footprint-c...
It's big, but it's honestly not that big. Most importantly, costs will quickly come down as we realize the limits of the models, the algorithms are optimized and even more-dedicated hardware is built. There's no reason to think it isn't sustainable, it will add up just fine.
But yes, it will attract a ton of advertising, the same curve every service goes through, like Google Search, YouTube, Amazon, etc. Still, just like Google and Amazon (subtly) label sponsored results, I expect LLM's to do the same. I don't think ads will be built into the main replies, because people will quickly lose trust in the results. Rather they'll be fed into a separate prompt that runs alongside the main text, or interrupts it, the way ads currently do, and with little labels indicating paid content. But the ads will likely be LLM-generated.
This is honestly why I struggle to get excited for anything in our industry anymore. Whatever it is it just becomes yet another fucking vector for ad people to shove yet more disposable shit in front of me and jingle it like car keys to see if I'll pull out a credit card.
The exception being the Steam Deck, though one could argue it's just a massive loss-leader for Steam itself and thus game sales (though I don't think that would hold up to scrutiny, it's pretty costly and it's not like Valve was hurting for business but anyway) but yeah. LLMs will absolutely do the exact same, and Google's now fully given up on making search even decent, replacing it with shit AI nobody asked for that will do product placements any day now, I would bet a LOT of money on it.
> I don't think ads will be built into the main replies,
> because people will quickly lose trust in the results.
The 'best' ads will be those the public doesn't recognize. Surf the internet without an ad blocker and you will die from a heart attack. This is a matter of conditioning users. It will take some time. Case in point: people already give up on privacy because "Google knows about everything already", which reflects a normalization of abuse, as we started from trust and norms ("don't be evil"). So, can they? Yes. Will they? Yes.
AI is not a consumer product.
Businesses will pay for AI. They will use it for whatever they are building. We will buy what those businesses build.
The medium term here is that AI is going to become part of the value chain. It's gonna be like stripe or insurance or labor.
AI companies want the cultural shift, i.e. get everyone used to having their data, art, work, etc, turned into models. Plus they want the PR, i.e. AI agents to be seen as helpful, friendly and genuinely useful. They want this to happen fast and before legislators react too. Releasing for free seems safe and efficient.
Once enough people become like this, they will gladly pay to keep it, they'll consider it a basic necessity.
The internet has become total garbage now all because a few men wanted to make a bunch of money by making silicon do the thinking for them.
AI is the quickest route to ruin and ending up with humans like in Idiocracy devoid of critical thinking, and the output of LLMs is so bad to read, students are just turning in the worst papers using LLMs and learning nothing.
At first my school banned use of them but then Microsoft tipped their hand because they donate a lot of money and now everyone is allowed to use AI and they got rid of the requirement to use MLA Citations so everything turned into slop.
I remember when search was a free ride. The articles that I found in searches were relevant, and there was no wordy boilerplate AI content specifically designed to get me to see all the advertising on the page. There is no free ride: AI will accelerate the enshittification of the Web by orders of magnitude. Barriers to garbage content generation are rapidly approaching 0.
How much would AirBnB pay for the intelligence everyone gets all their info from having a subtle bias like this? Sliiightly more likely to assume folks will stay in airbnbs vs a hotel when they travel, sliiightly more likely to describe the world in these terms.
How much would companies pay to directly, methodically and undetectably bias "everyone's most frequent conversant" toward them?
This would be a very impressive technical feat
They seem to generate extremely specific websites and content for every conceivable search phrase. I'm not even sure what their end goal is since they aren't even always riddled with affiliate links.
Sometimes I wonder if the AI companies are generating these low-quality search results to drive us to use their LLMs instead.
Presumably the goal is to build up a positive-ish reputation, before they start trying to monetize it. Or perhaps to sell the site itself for someone else to monetize, on the basis of the number of clicks it's getting per month.
Oh, you sweet summer child...
I have family that used to be in charge of dealing with institutional corruption. In particular, public service corruption.
It's bad. Very, very bad. When "public servants" are paid less than their private counterparts, are routinely treated like crap by their employers, as well as those they serve, and they are in charge of services that could be incredibly lucrative to others, you're guaranteed to get corruption.
"Let's just use AI!" is the rallying cry.
Now, let's examine a scenario, where the folks that can make money from the service, also run the tools that implement the service...
ook
I tried to search the full name of a specific roof company in my area in quotes, and they weren't in the first page of results. But I got so many disclosed and not disclosed ads for OTHER contractors.
SEO has turned search engines into a kind of quasi-mafia "protection" racket.. "oh you didn't pay your protection fee, wouldn't it be a shame if something happened to your storefront?"
Today, I can't watch any TV without immediately realizing that every face I see on TV is forced to sell their expression and talk. They are basically selling, not expressing their true feelings. Every great movie, actor, great singer, great anchor - everyone. There is nothing natural in human interactions any more.
I wish I could violently shake every internet user while yelling "If you are not paying money for it, you cannot complain about it"
The librarian is selling you a vuvuzela because that is the only way the library has been able to keep the lights on. They offered a membership but people flipped out: "Libraries are free! I never had to pay in the past! How dare you try and take my money for a free service!" They tried a "Please understand the service we provide and give a donation" approach, but less than 2% of people donated anything. Never mind that there is a backdoor you can use, allowing you to never need to interact with a librarian while fully utilizing the library's services (that the library still pays for).
The internet was ruined by people unwilling to pay for it. And yes, I know the internet was perfect in 1996, I have a pair of rose colored glasses too.
A couple months ago, I spent a week or two writing some shell scripts to exhaustively mine one of those pdf hosting companies, looking for digital copies of Paste magazine. I only became aware that they might still exist after having spent at least a week trudging through Wayback Machine's archives of the old Paste website. I think I managed to get 8 or 9 issues total.
Search is dead. There was a time when I could probably have found those with a careful Google search in under an hour.
I've found AI helpful for answering questions, but better at plausibly answering them, I still end up checking links to verify what was said and where it's sourced from. It saves frustration but not really time.
For a few months, I've been wondering: how long until advertisers get their grubby meathooks into the training data? It's trivial to add prompts encouraging product placement, but I would be completely shocked if the big players don't sell out within a year or two, and start biasing the models themselves in this way, if they haven't already.
This will distress the big players who want an open field to make money from their own adulterated inferior product so home grown LLM will probably end up being outlawed or something.
E.g., I'm sure people will pay for an LLM that plays Magic the Gathering well. They don't need it to know about German poetry or Pokemon trivia.
This could probably be done as LoRAs on top of existing generalist open-weight models. Envision running this locally and having hundreds of LLM "plugins", a la phone apps.
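For what it's worth, the LoRA "plugin" idea boils down to a small low-rank update added to a frozen weight matrix; a minimal numpy sketch (dimensions, rank, and the random data are arbitrary, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2                  # model dimension and adapter rank, r << d
W = rng.normal(size=(d, d))  # frozen base weight (the generalist model)
A = rng.normal(size=(r, d))  # trainable "down" projection
B = np.zeros((d, r))         # trainable "up" projection, zero-initialized

# A LoRA "plugin" is just the small pair (A, B); swapping plugins
# swaps the low-rank update B @ A while the base W never changes.
def forward(x, W, A, B, scale=1.0):
    return x @ (W + scale * (B @ A)).T

x = rng.normal(size=(1, d))
# With B zero-initialized, the adapter starts as a no-op:
assert np.allclose(forward(x, W, A, B), x @ W.T)
```

Storing hundreds of adapters is cheap because each one is only `2 * r * d` parameters instead of `d * d`, which is what makes the phone-app analogy plausible.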
I'm really looking forward to something like a GNU GPT that tries to be as factual, unbiased, libre and open-source as possible (possibly built/trained with Guix OS so we can ensure byte-for-byte reproducibility).
"... but if you used our VC's latest beau, BozoDB, it could be written like THIS! ... ..."
9 months, max. I give it 9 months.
This doesn't have to be as blunt as promoting specific libraries or services and it's a bias that could even be introduced "accidentally".
If you're someone like Marlboro or Coca-Cola, on the other hand, it might be worth your while to pollute the training data and wait for subtle allusions to your product to show up all over the place. Maybe they already did, long before LLMs even existed.
Your product placement is appropriately ironic.
You're so right. It's not an if anymore, but a when. And when it happens, you won't know what's an ad and what isn't.
In recent years I started noticing a correlation between alcohol consumption and movies. I couldn't help but notice how many of the movies I've seen in the past few years promote alcohol and try to correlate it with the good times. How many of these are paid promotions? I don't know.
And now, after noticing this, every movie that involves alcohol has become distasteful to me, mostly because it casts a shadow over the negative side of alcohol consumption.
I can see how ads in an LLM can go the same route, deeply embedded in the content and indistinguishable from everything else.
And there are countless books and movies where the hero has drinks, or routinely swigs some whisky-grade stuff from a flask on his belt to calm his nerves, then drives.
This depends a lot on the person. I, for example, would much more associate "reading scientific textbooks/papers" with having a good time. :-D
I would correct it to anti-alcohol sentiment being ingrained in American culture (as it is in some others, such as the Middle East) rather than western culture. Its an American hang-up, as with nudity etc.
It wasn't enough to kill alcohol consumption entirely, but it did cut back on the culture of overindulgence, as measured by death rates before and in the years after.
Other countries also banned alcohol in this time period, and New Zealand voted for it twice but never enacted the ban.
If the bait that they used to bring you to them so they could sell your eyeballs has finally started to rot and stink, then why do people continue to be attracted by it? You claim they've ruined their core product, but it still works as intended, never mind that you've confused what their products actually are.
"Turn left at McDonalds" is what a normal person would say if you asked for directions in a town you don't know. Or they could say "Turn left at McFritzberger street", but what use would that be for you?
Although I've had Google Maps say "Turn right after the pharmacy", and there's three drug stores in the intersection...
I'm also not particularly convinced any advertisers would pay for "Hey, we're going to direct people to just drive by your establishment, in a context where they have other goals very front-and-center on their mind. We're not going to tell them about the menu or any specials or let you give any custom messages, just tell them to drive by." Advertisers would want more than just an ambient mentioning of their existence for money.
There's at least two major classes of people, which are, people who take and give directions by road names, and people who take and give directions by landmarks. In cities, landmarks are also going to generally be buildings that have businesses in them. Before the GPS era, when I had to give directions to things like my high school grad party to people who may never have been to the location it was being held in, I would always give directions in both styles, because whichever style may be dominant for you, it doesn't hurt to have the other style available to double-check the directions, especially in an era where they are non-interactive.
(Every one of us Ye Olde Fogeys have memories of trying to navigate by directions given by someone too familiar with how to get to the target location, that left out entire turns, or got street names wrong, or told you to "turn right" on to a 5-way intersection that had two rights, or told you to turn on to a road whose sign was completely obscured by trees, and all sorts of other such fun. With GPS-based directions I still occasionally make wrong turns but it's just not the same when the directions immediately update with a new route.)
I still prefer street names since those tend to be well signed (in my area anyway) and tend not to change, whereas the business on the corner might be different a few years from now.
Advertisers and spammers have the highest possible incentive to subvert the system, so they will. Which is only one step worse (or better depending on your view) than letting a mega corp control all the flow of information absolutely.
Welcome to the new toll booth of the internet, now with 50% less access to the source material (WOW!), I hope you have a pleasant stay.
This kind of ad is also impossible to filter. Everyone complains about ads on YouTube or Reddit, but I never see any with my ad blockers. Now we won't be able to squash them.