I understand the safety needs around things like an LLM helping to build nuclear weapons, but it would be nice to have a frontier model that could write or find porn.
It does miss occasionally, or I feel like "that was a waste of tokens" due to a bad response or something, but overall I like supporting Kagi's current mission in the market of AI tools.
Kagi is treating LLMs as potentially useful tools to be used with their deficiencies in mind, and with respect for user choices.
Also, we're explicitly fighting against slop:
The post describes how their use case is finding high-quality sources relevant to a query and providing summaries with references/links to the user (not generating long-form "research reports").
FWIW, this aligns with what I've found ChatGPT useful for: a better Google, rather than a robotic writer.
Their search is still trash.
So I actually find it the perfect thing for Kagi to work with. If they can leverage LLMs to improve search, without getting distracted by the “AI” stuff, there’s tons of potential value.
Not saying that’s what this is… but if there’s any company I’d want playing with LLMs it’s probably Kagi
Agents/assistants but nothing more.
Do you have any pointers?
As for the people who claim this will create/introduce slop, Kagi is one of the few platforms actively fighting against low-quality AI-generated content with their community-fueled "SlopStop" campaign.[0]
Not sponsored, just a fan. Looking forward to trying this out.
And to be clear, you shouldn't build the tools that YOU find useful; you should build the tools that your users, who pay for a specific product, find useful.
You could have LLMs that are actually 100% accurate in their answers and it would not matter at all to what I am raising here. People are NOT paying Kagi for bullshit AI tools, they're paying for search. If you think otherwise, prove it: make the subscriptions for the two products entirely separate.
edit: seeing the first two (negative) replies to my comment made me smile. HN is a tough crowd to please :) The thing is, similar to how I did paid search and went all in with my own money when everyone thought I was crazy - out of my own need, and my family's need, to have search done right - I am doing the same now with AI, wanting to have it done right as a product. What you see here is the best effort of this group of humans that call themselves Kagi - not more, not less.
Well if that doesn't seal the deal in making it clear that Kagi is not about search anymore, I don't know what does. Sad day for Kagi search users, wow!
> Having the most accurate search in the world that has users' best interest in mind is a big part of it
It's not, you're just trying to convince yourself it is.
For what it's worth, as someone who tends to be pretty skeptical of introducing AI tools into my life, this statistic doesn't really convince me much of the utility of them. I'm not sure how to differentiate this from selection bias where users who don't want to use AI tools just don't subscribe in the first place rather than this being a signal that the AI tools are worthwhile for people outside of a niche group who are already interested enough to pay for them.
This isn't as strong a claim as what the parent comment was saying; it's not saying that the users you have don't want to be paying for AI tools, but it doesn't mean that there aren't people who are actively avoiding paying for them either. I don't pretend to have any sort of insight into whether this is a large enough group to be worth prioritizing, but I don't think the statement of your perspective here is going to be particularly compelling to anyone who doesn't already agree with you.
I have my own payment methods for AI (OpenWebUI hosted on a personal home server, connected to OpenRouter API credits, which costs me about $1-10 per month depending on my usage), so seeing AI bundled with search in Kagi's pricing really just sucks the value out of the main reason I want to switch to Kagi.
I would love to be able to just buy credits freely (say 300 credits for $2-3) and use them whenever. No AI stuff, no subscription, just pay for my searches. If I have a lull in my searches for a month, then a) no extra resources from Kagi have been spent, and b) my credits aren't used and roll over. Similarly, if I have a heavy search month, then I'll buy more credits.
I just don't want to buy extra AI on top of what I already have.
The recommendation you made takes your personal preference as an axiom.
The fact is that the APIs in search cost vastly more than the LLMs used in quick answer / quick assistant.
If you use the expensive AI stuff (research assistant or the big tier 1 models) that's expensive. But also: it is in a separate subscription, the $25/month one.
We used to not give any access to the assistant at the $5 and $10 tiers; now we do. It's a free upgrade for users.
> Note: This is a personal essay by Matt Ranger, Kagi’s head of ML
I appreciate the disclaimer, but never underestimate someone's inability to understand something when their job depends on them not understanding it.
Bullshit isn't useful to me, I don't appreciate being lied to. You might find use in declaring the two different, but sufficiently advanced ignorance (or incompetence) is indistinguishable from actual malice, and thus they should be treated the same.
Your essay, while well written, doesn't do much to convince me any modern LLM has a net positive effect. If I have to duplicate all of its research to verify none of it is bullshit - which will only be harder after using it, given the anchoring and confirmation bias it will introduce - why?
For what it's worth, I get good use out of a research agent running this prompt:

"You are a research agent. Nobody cares what you think. You exist to do research on the internet. You search like hell and present the most relevant results that you can find. Results means lists of primary sources and the relevance to the query. If the question is about a specific place, don't forget to search in local languages as well as English. You don't ever suggest alternative search terms in your reply. Instead, you take those alternative search terms and you do more searches yourself until you get a wide range of answers."
It generates a lot of good leads very quickly when I want to learn about something. The bit about local languages is especially handy; it gives a bit of an edge over traditional search engines in many situations.

> which will only be harder after using it given the anchoring and confirmation bias it will introduce
This is a risk, but I have found that my own preconceptions are usually what need challenging, and a traditional search approach means I find what I wanted to find... so I use the research agent for an alternative perspective.
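For anyone who wants to try a similar setup outside a hosted UI, here is a minimal sketch of wiring a prompt like that in as a system message through an OpenAI-compatible endpoint (OpenRouter in this example, since that kind of setup was mentioned upthread). The model choice and the sample question are placeholders, and a real research agent would also need a web-search tool hooked up, which this sketch omits.

```python
from openai import OpenAI

# Illustrative only: any OpenAI-compatible endpoint works; OpenRouter is used
# here because a similar setup was mentioned upthread.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter key
)

# Shortened here; use the full prompt quoted above.
SYSTEM_PROMPT = (
    "You are a research agent. Nobody cares what you think. You exist to do "
    "research on the internet. You search like hell and present the most "
    "relevant results that you can find, as lists of primary sources and "
    "their relevance to the query."
)

response = client.chat.completions.create(
    model="openrouter/auto",  # placeholder; pick whichever model you prefer
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Recent developments in solid-state batteries?"},
    ],
)
print(response.choices[0].message.content)
```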
Just to give my point of view: I'm head of ML here, but I'm choosing to work here for the impact I believe I can have. I could work somewhere else.
As for the net positive effect, the point of my essay is that the trust relation you raise (not having to duplicate the research, etc) to me is a product design issue.
LLMs are fundamentally capable of bullshit. So products that leverage them have to keep that in mind to build workflows that don't end up breaking user trust in it.
The way we're currently thinking of doing that is to keep the user in the loop and incentivize the user to check sources by making it as easy as possible to quickly fact check LLM claims.
I'm on the same page as you that a model you can only trust 95% of the time is not useful because it's untrustable. So the product has to build an interaction flow that assumes that lack of trust but still makes something that is useful, saves time, respects user preferences, etc.
You're welcome to still think they're not useful for you, but that's the way we currently think about it and our goal is to make useful tools, not lofty promises of replacing humans at tasks.
And despite my disagreement, I do genuinely appreciate the reply, especially given my dissent.
> Just to give my point of view: I'm head of ML here, but I'm choosing to work here for the impact I believe I can have. I could work somewhere else.
> As for the net positive effect, the point of my essay is that the trust relation you raise [...] to me is a product design issue.
Product design, or material appropriability issue? Why is the product you're trying to deliver based on top of a conversational model? I know why my rock climbing rope is a synthetic 10mm dynamic kernmantle rope, but why is a conversational AI the right product here?
> LLMs are fundamentally capable of bullshit.
Why though? I don't mean from a technical level; I do understand how next-token prediction works. But from a product reasonability standpoint? Why are you attempting to build this product using a system that makes predictions based on completely incorrect or inappropriate inputs?
Admittedly, I am not up to date on the state of the art, so please do correct me if my understanding is incomplete or wrong. But if I'm not mistaken, attention-based transformers themselves generally don't hallucinate when producing low-temperature language-to-language translations, right? Why are conversational models, the ones very much prone to hallucinations and emitting believable bullshit, the interface everything uses?
How much of that reason is because that ability to emit believable bullshit is actually the product you are trying to sell? (The rhetorical you; I'm specifically considering LLM-as-a-service providers eager to oversell the capabilities of their model. I still have a positive opinion of Kagi, so I could be convinced you're the ones who are different.) The artificial confidence is the product. Bullshitting something believable but wrong has better results, in bulk, for the metrics you're tracking. When soliciting feedback, the vast majority of the answers are based on vibes, right?
Suppose you had two models. One is rote and very reliable, very predictable, and rarely produces inaccurate output, but it isn't impressive when trying to generate conversational-feeling text and, critically, is unable to phrase things in a trivial-to-understand way exuding an abundance of confidence. Contrast that with another that only very rarely produces total bullshit, and all the feedback shows everyone loves using that model because it makes them feel good about the answer, yet there's still that nagging hallucination issue bubbling under the surface.
Which would you ship?
Again, I'm asking which you would ship with the rhetorical you... perhaps there is someone in charge of AI who would only ship the safe version, even if few users ranked it higher than normal organic search. Unfortunately, I'm way too much of a cynic to believe that's possible. The "AI is good" crowd doesn't have a strong reputation for always making the ethical selection.
> So products that leverage them have to keep that in mind to build workflows that don't end up breaking user trust in it.
> The way we're currently thinking of doing that is to keep the user in the loop and incentivize the user to check sources by making it as easy as possible to quickly fact check LLM claims.
Do you feel that's a reasonable expectation from users when you've already given them the perfect answer they're looking for with plenty of subjective confidence?
> I'm on the same page as you that a model you can only trust 95% of the time is not useful because it's untrustable. So the product has to build an interaction flow that assumes that lack of trust but still makes something that is useful, saves time, respects user preferences, etc.
> You're welcome to still think they're not useful for you, but that's the way we currently think about it and our goal is to make useful tools, not lofty promises of replacing humans at tasks.
I don't think I'm the ideal person to be offering advice, because I would never phrase the problem statement as "we have to give users the tools to verify if the confident-sounding thing lied this time." I know far too much about both human nature and alarm fatigue. So I can only reject your hypothetical and ask: what if you didn't have to do something I worry will make the world worse?
I attribute a large portion of the vitriol, anger, and divisiveness that has become pervasive, and that is actively harming people and communities, directly to modern algorithmic recommendation systems. These systems prioritize speed, and being first, above the truth. Or they rank personalized results that selectively offer only the content that feels good and confirms preexisting ideas, to the detriment of reality.
They all tell you what you want to hear over what is true. It will take a mountain of evidence to assure me that conversational LLMs won't do exactly the same thing, just better or faster. Especially when I could uncharitably summarize your solution to these defects as merely "encouraging people to do their own research".
Actually, if you use LLMs sized responsibly to the task, it's cheaper than a lot of APIs for the final product.
The expensive LLMs are expensive, but the cheap ones are cheaper than other infrastructure in something like Quick Answer or Quick Assistant.
Since 2019, Google and Bing have both used BERT-style encoder-only architectures in search.
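To make "encoder-only" concrete, here is a toy sketch of that retrieval style (my own illustration, not a claim about Google's or Bing's internals): the query and the documents are embedded separately by an encoder and ranked by similarity, with no text generation step involved.

```python
# Toy illustration of encoder-only retrieval (not Google/Bing's actual stack).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small BERT-family encoder

docs = [
    "Kagi is a paid search engine with no ads.",
    "BERT is an encoder-only transformer introduced in 2018.",
    "Solid-state batteries use a solid electrolyte.",
]
doc_emb = model.encode(docs, convert_to_tensor=True)

query_emb = model.encode(
    "which transformer architecture powers semantic search?",
    convert_to_tensor=True,
)

# Rank documents by cosine similarity to the query; no tokens are generated.
scores = util.cos_sim(query_emb, doc_emb)[0]
best = int(scores.argmax())
print(docs[best], float(scores[best]))
```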
I’ve been using Kagi Ki (now Research Assistant) for months and it is a fantastic product that genuinely improves the search experience.
So overall I’m quite happy they made these investments. When you look at Google and Perplexity this is largely the direction the industry is going.
They’re building tools on top of other LLMs and basically running OpenRouter or something behind the scenes. They even show you your token use/cost against your allowance/budget on the billing page, so you know what you’re paying for. They’re not training their own from-scratch LLMs, which I would consider a waste of money at their size/scale.
We get specific deals with providers and use different ones for production models.
We do train smaller scale stuff like query classification models (not trained on user queries, since I don't even have access to them!) but that's expected and trivially cheap.
If you like it, it's only $10/month, which I regrettably spend on coffee some days.
What they've been building for the past couple of years makes it blindingly clear that they are definitely not a search engine *above all else*.
Don't believe me? Check their CEO's goal: https://news.ycombinator.com/item?id=45998846
As in, not "free"?
Either way, I guess we'll see how this affects the service.
Sounds like Kagi might need to implement some better regional pricing.
1. It answers using only the crawled sites. You can't make it crawl a new page.
2. It doesn't use a page's search function automatically.
This is expected, but it doesn't hurt to keep in mind. I think it'd be pretty useful: you ask for recent papers on a site, the engine could use Hacker News' search function, and then Kagi would crawl the page.
"""
site:https://hn.algolia.com/?dateRange=pastYear&page=0&prefix=fal... recent developments in ai?
"""
Also, when testing, if you know a piece of information exists on a website but it doesn't show up when you run the query, you don't have the tools to steer the engine to work more effectively. In a real scenario you don't know what the engine missed, but it'd be cool to steer the engine in different ways to see how that changes the end result. For example, if you're planning a trip to Japan, maybe you want the AI to only be shown a certain percentage of categories (nature, night life, or places too), alongside controlling how much time you want to spend crawling, maybe finding more niche information or more related information.
Prompt: "At a recent SINAC conference (approx Sept 2025) the presenters spoke about SINAC being underresourced and in crisis, and suggested better leveraging of and coordination with NGOs. Find the minutes of the conference, and who was advocating for better NGO interaction."
The conference was actually in Oct 2024. The approx date in parens causes Gemini to create an entirely false narrative, which includes real people quoted out of context. This happens in both Gemini regular chat and Gemini Deep Research (in which the narrative gets badly out of control).
Kagi reasonably enough answers: "I cannot find the minutes of a SINAC conference from approximately September 2025, nor any specific information about presenters advocating for better NGO coordination at such an event."
Kagi actually got those two correct, explaining how losing your king means losing your chess game, which is quite impressive.
Which is why you should generally have benchmarks that are a bit broader (>10 questions for a personal setup); otherwise you overfit to noise.
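For anyone putting together such a personal benchmark, here is a minimal sketch of the idea; `ask` is a hypothetical stand-in for whichever assistant or search tool you're evaluating.

```python
from typing import Callable

# (question, substring expected in a correct answer) pairs.
# Keep this well above 10 items so a single lucky or unlucky
# answer doesn't dominate the score.
QUESTIONS = [
    ("In which year did the first crewed Moon landing happen?", "1969"),
    ("What is the chemical symbol for gold?", "Au"),
    # ... add the rest of your questions here
]

def score(ask: Callable[[str], str]) -> float:
    """Return the fraction of questions whose answer contains the expected text."""
    hits = sum(expected.lower() in ask(question).lower()
               for question, expected in QUESTIONS)
    return hits / len(QUESTIONS)
```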
```
Evaluating Date Discrepancies
I've just uncovered a significant discrepancy: a "Hacker News" result flags a date error originating from the very format of parenthetical dates. The system appears to be misinterpreting these, leading to demonstrably false information. This warrants further investigation to understand the root cause and ensure data integrity.
Analyzing Date Ambiguity
I'm now grappling with the user's date. The Hacker News comment suggests the user's prompt might be flawed, as it contradicts the actual date of an event I've indexed. Specifically, the conference referenced occurred in October 2024, not September 2025 as the prompt implies. This leads me to question the user's understanding of the timeline or if the prompt refers to an unrelated event in September 2025. This inconsistency demands further clarification.
...
Investigating User Prompt Intent
I've just realized that the Hacker News snippet reveals this might be a specific problematic test prompt. My strategy is to focus on the October 2024 forum, a strong match, organized by Kattia Cambronero and CCT. I'll verify if there was a 2025 event, but the 2024 event fits the described "crisis" and "NGO coordination". If the user says "approx Sept 2025," they likely mean a recent event.
```
I tried the quick assistant a bit (don't have ultimate so I can't try research), and while the writing style seems slightly different, I don't see much difference in information compared to using existing models through the general kagi assistant interface.
For now Quick assistant has a "fast path" answer for simple queries. We can't support the upgrades we want to add in there on all the models because they differ in tool calling, citation reliability, context window, ability to not hallucinate, etc.
The responding model is currently Qwen3-235B from Cerebras, but we want to decouple the user expectations from that so we can upgrade it down the road to something else. We like Kimi, but couldn't get a stable experience for Quick on it at launch with current providers (tool calling unreliability).
Just recently started paying for Kagi search and quite love it.
I’m an Assistant Principal so use it to help me get better with spreadsheets, churn through complex formulas, and some other miscellaneous tasks for feedback and assistance. Definitely use a lot of screenshots of things to also help consume info in there.
I think I might be stuck with both for a while as I’m not sure Kagi can quite fill this gap yet.
You have a spend limit, but the assistant has dozens of models.
I guess I should also explore how capable the free version is at this point, too.
I don’t want to use Kagi Ultimate (I use too many other features of ChatGPT and Claude); I just want to be able to improve the results of my AI models with Kagi.
We have an MCP server I can give you access to for search immediately. Down the line, a search API and a chat completions API to our assistant are in the pipeline.
Scheduled agents in ChatGPT.
I’ve been using the study mode recently, and that is nice to have.
I hadn't been sure about Kagi before, but this has really swung it for me, I'm off to sign up post haste. It's a revolutionary move that really shows how fast ahead of the competition Kagi is, how dexterous their fingers at the pulse of humanity, how bold.
> We found many, many examples of benchmark tasks where the same model using Kagi Search as a backend outperformed other search engines, simply because Kagi Search either returned the relevant Wikipedia page higher, or because the other results were not polluting the model’s context window with more irrelevant data.
> This benchmark unwittingly showed us that Kagi Search is a better backend for LLM-based search than Google/Bing because we filter out the noise that confuses other models.
I'm not convinced about this. If the strategy is "let's return wikipedia.org as the most relevant result", that's not sophisticated at all. In fact, it only worked for a very narrow subset of queries. If I search for 'top luggage for solo travel', I don't want to see Wikipedia, and I don't know how Kagi will be any better.
Generally we do particularly better on product research queries [1] than other categories, because most poor review sites are full of trackers and other stuff we downrank.
However, there aren't public benchmarks for us to brag about on product search, and frankly the SimpleQA digression in this post made it long enough that it was almost cut.
1. (Except hyper local search like local restaurants)
Hey Google, Pinterest results are probably messing with AI crawlers pretty badly. I bet it would really help the AI if that site was deranked :)
Also if this really is the case, I wonder what an AI using Marginalia for reference would be like.
It's likely they can filter the results for their own agents but will leave other results as they are. Half the issue with normal results is their ads - that's not going away.
They spent the last decade and a half encouraging the proliferation of garbage via "SEO". I don't see this reversing.
Unlikely. There are very few people willing to pay for Kagi. The HN audience is not at all representative of the overall population.
Google can have really miserable search results and people will still use it. It's not enough to be as good as Google; you have to be 30% better than Google and still free in order to convert users.
I use Kagi and it's one of the few services I am OK with a recurring charge from, because I trust the brand for whatever reason. Until they find a way to make it free, though, it can't replace Google.
https://kagi.com/stats
Kagi works better and will continue to do so as long as Kagi’s interests are aligned with users’ needs and Google’s aren’t.