Waiting for dawn in search: Search index, Google rulings and impact on Kagi

https://blog.kagi.com/waiting-dawn-search
95•josephwegner•2h ago

Comments

whs•2h ago
>Google: Google does not offer a public search API. The only available path is an ad-syndication bundle with no changes to result presentation - the model Startpage uses. Ad syndication is a non-starter for Kagi’s ad-free subscription model.[^1]

>Because direct licensing isn’t available to us on compatible terms, we - like many others - use third-party API providers for SERP-style results (SERP meaning search engine results page). These providers serve major enterprises (according to their websites) including Nvidia, Adobe, Samsung, Stanford, DeepMind, Uber, and the United Nations.

The customer list matches what is listed on SerpAPI's page (interestingly, DeepMind is on Kagi's list while they're a Google company...). I suppose Kagi needs to pen this because if SerpAPI shuts down they may lose access to Google, but they may already utilize multiple providers. In the past, Kagi employees have said that they have access to a Google API, but it seems that was not the case?

As a customer, the major implication of this is that even if Kagi's privacy policy says they try to not log your queries, it is sent to Google and still subject to Google's consumer privacy policy. Even if it is anonymized, your queries can still end up contributing to Google Trends.

xnx•1h ago
> Because direct licensing isn’t available to us on compatible terms, we - like many others - use third-party API providers for SERP-style results

Crazy for a company to admit: "Google won't let us whitelabel their core product so we steal it and resell it."

direwolf20•1h ago
Pretty standard business practice though. There's no ethics in making money.
Ar-Curunir•1h ago
Strange to pick on Kagi when there's much bigger companies on that list.
xnx•17m ago
Those companies allegedly have used SerpAPI (probably to check visibility), but not to resell a Google Search knock-off.
shadowgovt•1h ago
But in this current climate, they can admit it and then dare Google to tell them to stop... After Google has just had an antitrust ruling against it for dominating the search market.

Google doesn't really have a leg to stand on and they know it.

techjamie•1h ago
What's the alternative? Building a competing search index as a relative nobody on the web is very difficult from the outset, and is made more difficult by sites taking extra measures to stop bots in general now.

Google's crawler is given special privileges in this regard and can bypass basically all bot checks. Anyone else has to just wade through the mud and accept they can't index much of the web.

eli•1h ago
Seems like an open question as to whether that violates any laws.

Another way to look at it is that if you publish a service on the web, you have limited rights to restrict what people do with it.

Isn't that the logic Google search relies on in the first place? I didn't give permission for Google to crawl and index and deep link to my site (let alone summarize and train LLMs on it). They just did it anyway, because it's on a public website.

direwolf20•1h ago
I hope they cache search results to further reduce the number of calls to Google.

And Marginalia Search was not mentioned? Marginalia Search says they are licensing their index to Kagi. Perhaps it's counted under "Our own small-web index" which is highly misleading if true.

packetlost•1h ago
The index is not necessarily the code, but the dataset. IMO it would be better to be more open about the technical stack, but I don't think this feels dishonest to me.
xnx•13m ago
> "Our own small-web index"

Has Kagi ever said what this is? I wouldn't be at all surprised if it is just kagi.com pages or a download of Wikipedia.

OGEnthusiast•1h ago
Sounds like we need a nationalized search engine company then?
browningstreet•1h ago
I wouldn't trust a nationalized search engine company.

That said, there are projects like Common Crawl and in Europe, Ecosia + Qwant.

I personally would like to see a search engine PaaS and a music streaming library PaaS that would let others hook up and pay direct usage fees.

shadowgovt•1h ago
An interoperable search index access standard might work. We've done something similar for peering and the backbone of the IP-layer interconnects themselves.
NitpickLawyer•1h ago
> and in Europe, Ecosia

I tried. It's just not good enough. Quick example: yesterday I set up a workstation with Ubuntu, wanting to try out wayland. One of the things I wanted was to run an app (w/ gui) from another (unprivileged) user under my own user. Ecosia gave me bad old stuff. Tried for a few minutes, nothing useful. Switched to google, one of the first results was about waypipe. Searched waypipe on ecosia. 1 and a half pages of old content. Glaringly, not one of those results was the ubuntu.manpages entry on waypipe. shrug

ajdude•1h ago
Does anyone else use the phrase "I'm going to google XYZ" while referring to actually searching it up on Kagi, DDG, or another search engine?
chroma205•1h ago
> Does anyone else use the phrase "I'm going to google XYZ" while referring to actually searching it up on Kagi, DDG, or another search engine?

Not me. I only use Google.

Never used Kagi or DDG. Don’t care enough.

jeremyjh•1h ago
Yes, it’s like Xerox or Kleenex except it’s actually still a monopoly. I'm a happy Kagi user but I know hardly anyone else is.
dijksterhuis•1h ago
nope, i say “i’m going to search for XYZ” or similar
eli•1h ago
Ironically this is a bad thing for Google from a legal standpoint. If a term becomes "genericized" then it can lose trademark protection.

"Aspirin" is a famous example. It used to be a brand name for acetylsalicylic acid medication, but became such a common way to refer to it that in the US any company can now use it.

1-more•50m ago
Apparently the "lost in the Treaty of Versailles" explanation is a bit of a just-so story: https://history.stackexchange.com/questions/55729/why-did-ba...
pixl97•1h ago
Yes, but more in the past than now, simply because almost everybody seems to use google itself.

For example I'd hear people say "I'll Google that", then use Yahoo when they were still a major search engine.

shervinafshar•1h ago
I've been using Kagi for the past few years, but I try to use a brand-agnostic language talking about web search; e.g. "I'm gonna search [the web] for it"; "Use your favorite search engine to look it up".
kqr•53m ago
I used to. Even when I actually used DDG. Now that I use Kagi (and thus am on the second web search service after I stopped using Google) it started to feel silly so I say "search the web" these days.
dooglius•42m ago
Yeah, I don't feel the need to have conversations go on a tangent about explaining what Kagi is
bronson•24m ago
Now my family usually says "I'm going to ask AI."
hsuduebc2•1h ago
It is even worse that Google Search has become shit in recent years. So they gatekeep the only relevant information for themselves and don't use it with the intent to improve search quality. As always, if you have no competition, your innovation goes only towards cost reduction, not product improvement.
warkdarrior•8m ago
If Google Search is shit, why does Kagi want access to it?
WhyNotHugo•1h ago
The statistics in this article sound like garbage to me.

Google used by 90% of the world?

~20% of the human population lives in countries where Google is blocked.

OTOH, Baidu is the #1 search engine in China, which has over 15% of the world’s population… but doesn’t reach 1%?

These stats are made measuring US-based traffic, rather than “worldwide” as they claim.

0x1ch•1h ago
Google is only blocked in places where it would already be hard for a company with morals to work in, if not outright blocked as well. This probably represents traffic globally, excluding those places.

Instead of downvoting blindly, please state which countries currently block Google but would willingly allow Kagi, an AI/privacy-focused search engine company, to exist in their domain. The results may surprise you!

lolc•59m ago
I guess they'd argue that the people in China don't count, because people in China don't get to choose Google. But yeah, the stats they use from "StatCounter" are clearly not representative for what the world uses.
elAhmo•9m ago
You can argue that people outside of China don't get to choose something other than Google. Sure, there are recent pushes with default search engine choices and similar initiatives, but there is a reason why Google is paying hundreds of millions of dollars to be the default search engine.
yomismoaqui•1h ago
One thing I have discovered after using AI chats that include a web search tool is that I don't want to delve into different blogs, Medium posts, Stack Overflow threads with passive-aggressive mod comments, dismissing cookie banners... Sorry, I just want the info I'm looking for; I don't care for your personal expression or need to monetize your content.

There are other times (usually not work related) when I want to explore the web and discover some nice little blog or special corner of the net. This is what my RSS feed reader is for.

kqr•51m ago
With Kagi you can opt in to an LLM summary of the search result by appending a question mark to the query. It's a neat mechanism when it works!
ghm2199•1h ago
> Building a comparable one from scratch is like building a parallel national railroad..

Not to be pedantic here but I do have a noob question or two:

1. One is building the index, which is a lot harder without Google offering its own API to boot. If other tech companies really wanted to break this monopoly, why can't they just do it (like they did with LLM training for base models with the infamous "pile" dataset)? The upshot of offering this index for the public good would be to break not just Google's own monopoly but also other monopolies like Android, which would introduce a breath of fresh air into a myriad of UX (mobile devices, browsers, maps, security). So, why don't they just do this already?

2. The other question is about "control", which the DoJ has provided guidance for but not yet enforced. IANAL, but why can't a state's attorney general enforce this?

hsuduebc2•42m ago
I don’t think it’s comparable to today’s AI race.

Google has a monopoly, an entrenched customer base, and stable revenue from a proven business model. Anyone trying to compete would have to pour massive money into infrastructure and then fight Google for users. In that game, Google already won.

The current AI landscape is different. Multiple players are competing in an emerging field with an uncertain business model. We’re still in the phase of building better products, where companies started from more similar footing and aren’t primarily battling for customers yet. In that context, investing heavily in the core technology can still make financial sense. A better comparison might be the early days of car makers, or the web browser wars before the market settled.

hamdingers•32m ago
> If other tech companies really wanted to break this monopoly, why can't they just do it

Google is a verb, nobody can compete with that level of mindshare.

wongarsu•23m ago
Xerox is a verb, but most copy machines I see are made by their competition
hamdingers•21m ago
Wonder why that could be?

https://www.nytimes.com/1975/07/31/archives/xerox-settlement...

eikenberry•22m ago
Kleenex isn't the only brand of tissues sold in stores.
Zyst•21m ago
So were AOL, and Skype
observationist•16m ago
A big part of it is about the legal minefield if you presented any sort of real threat to Google. Nobody wants to wager billions in infrastructure and IP against Google or Apple or Microsoft, even if you could whip up a viable competing product in a weekend (for any given product.)

Part of it is also the ecosystem - don't threaten adtech, because the wrong lawsuits, the wrong consumer trend, the wrong innovation that undercuts the entire adtech ecosystem means they lose their goose with the golden eggs.

Even if Kagi or some other company achieves legitimate mindshare in search, they still don't have the infrastructure and ancillary products and cash reserves of Google, etc. The second they become a real "threat" in Google's eyes, they'd start seeing lawsuits over IP and hostile and aggressive resource acquisitions to freeze out their expansion, arbitrary deranking in search results, possible heightened government audits and regulatory interactions, and so on. They have access to a shit ton of legal levers, not to mention the whole endless flood of dirty tricks money can buy (not that Google would ever do that.)

They're institutional at this point; they're only going away if/when government decides to break it up and make things sane again.

xnx•16m ago
> If other tech companies really wanted to break this monopoly, why can't they just do it

Companies would rather sue than try and compete by investing their own money.

walls•7m ago
A huge amount of the web is only crawlable with a googlebot user-agent and specific source IPs.
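One visible piece of this can be sketched with Python's standard `urllib.robotparser`. The robots.txt below is a hypothetical example (not taken from any real site), and `NewSearchBot` is a made-up crawler name. It only illustrates user-agent rules; source-IP checks happen server-side and cannot be shown this way.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may fetch everything, everyone else nothing.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    for agent in ("Googlebot", "NewSearchBot"):
        print(agent, can_crawl(agent, page))
```

A well-behaved new crawler that honors a file like this gets nothing, while a crawler that ignores it still has to get past the IP-based checks the comment mentions.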
paxys•7m ago
Apple had a chance to break Google's search monopoly, but they chose to take billions from them instead.

Microsoft had a chance (well another chance, after they gave up IE's lead) to break up Google's browser monopoly, but they decided to use Chromium for free instead.

Ultimately all these decisions come down to what's more profitable, not what's in the best interests of the public. We have learned this lesson x1000000. Stop relying on corporations to uphold freedoms (software or otherwise), because that simply isn't going to happen.

the_arun•58m ago
If google is serving 90% traffic & others are unable to enter - Doesn't that mean google is doing something right for the customer and others are unable to outcompete it? Isn't this how life works?
CGMthrowaway•54m ago
Google is allowed to be big, be better and win users. But happy customers is not the full test of monopolization. The real question is, "Could a meaningfully better search engine realistically displace Google today?” If the answer is no, then competition is broken
xnx•9m ago
> "Could a meaningfully better search engine realistically displace Google today?”

ChatGPT clearly demonstrated that displacing Google is possible. All previous monopoly arguments seemed even more flimsy after that.

rafterydj•52m ago
This is a woefully naive view on the nature of monopolies. You could have made the same argument for Standard Oil.
soiltype•47m ago
...No. Not at all. Not in the case of Google and generally that's not "how life works". If it was true, why would Google spend so much money to be the default search engine in so many devices/browsers?
hamdingers•23m ago
Is the user's choice to use google a meaningful one when they're effectively the only game in town?
giantrobot•10m ago
Google must be right for the customer because Google pays billions of dollars to be the default search engine for all the major browsers. And end users are notorious for changing application defaults.
jeffbee•54m ago
"We will simply access the index" has always struck me as wild hand-waving that would instantly crumble at first contact with technical reality. "At marginal cost" is doing a huge amount of work in this article.
nige123•51m ago
The user data (anonymised) and analytics also needs to be shared.
user3939382•48m ago
For anyone not acquainted Kagi is excellent and the people who work there strike me as nice and competent. I’m a harsh critic usually. Highly recommended.
flkiwi•16m ago
I've gotten more value out of it than just about any ongoing subscription I have. It's clean, fast, deeply customizable (i.e., excluding "answers" websites or any other domain you never want to see again), and, for what it is, inexpensive. Honestly if Google (or Bing) worked like Kagi does, I'd trade some of the privacy for the utility.
ares623•48m ago
Kagi should start building an index of sites that are trying to escape the current slop internet. I know they have the Small Web thing. But I’d like to see an index of a “neo internet” that blocks Google et al.
WhereIsTheTruth•44m ago
Kagi's "waiting for dawn" is just waiting for Google to legitimize their reseller business

Meanwhile, users pay a premium to pretend they're not using Google

Fascinating delusion

b3kart•36m ago
> Meanwhile, users pay a premium to pretend they're not using Google

My searches can’t be tied to me by Google for their ad targeting: this is worth paying a premium for, and I am glad Kagi are providing this service.

You seem to have a very limited understanding of the value Kagi provides.

stephen_cagle•38m ago
One interesting point was the original PageRank algorithm greatly benefited from the fact that we kinda only had "text matching" search before Google (my memory was AltaVista at the time).

Because text matching was so difficult to search with, whenever you went to a site, it would often have a "web of trust" at the bottom where an actual human being had curated a list of other sites that you might like if you liked this site.

So you would often search with keywords (often literals), then find the first site, then recursively explore the web of trust links to find the best site.

My suspicion has always been that Google (PageRank) benefited greatly from the human curated "web of trust" at the bottom of pages. But once Google came out, search was much better, and so human beings stopped creating "web of trust" type things on their site.

I am making the point that Google effectively benefited from the large amount of human labor put into connecting sites via WOT, while simultaneously (inadvertently) destroying the benefit of curating a WOT. This means that by succeeding at what they did, they made it much more difficult for a Google#2 to come around and run the exact same game plan with even the exact same algorithm.

tldr; Google harvested the links that were originally curated by human labor; the incentive to create those links is gone now, so the only remaining "links" between things are now in the Google Index.
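To make the dependence on curated links concrete, here is a minimal textbook-style PageRank sketch using power iteration. It is an illustration under assumptions, not Google's actual ranking code: the page names, the 0.85 damping factor, and the iteration count are all hypothetical choices.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank evenly along its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical hand-curated "you might also like" links.
curated = {
    "hobby-a": ["hobby-b", "reference"],
    "hobby-b": ["hobby-a", "reference"],
    "reference": ["hobby-a"],
    "newcomer": ["reference"],
}
# The same pages once nobody maintains link lists any more.
uncurated = {page: [] for page in curated}

if __name__ == "__main__":
    for name, graph in (("curated", curated), ("uncurated", uncurated)):
        scores = pagerank(graph)
        print(name, sorted(scores, key=scores.get, reverse=True))
```

With the curated links, the page nobody links to ("newcomer") ends up with the lowest score. With the links removed, every page ties, so the algorithm has no signal left to rank on.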

Addendum: I asked claude to help me think of a metaphor, and I really liked this one as it is so similar.

```
"The railroad and the wagon trails"

Before railroads, collective human use created and maintained wagon trails through difficult terrain. The railroad company could survey these trails to find optimal routes. Once the railroad exists, the wagon trails fall into disuse and the pathfinding knowledge atrophies. A second railroad can't follow trails that are now overgrown.
```

sabslikesobs•13m ago
I like that there's a list of primary sources at the bottom.

Kagi's AI assistant has been satisfying compared to Claude and ChatGPT, both of which insisted on having a personality no matter what my instructions said. Trying to do well-sourced research always pissed me off. With Kagi it gives me a summary of sources it's found and that's it!
