Show HN: Crustdata (YC F24) – Web Search API for Token-Efficient AI Agents

https://crustdata.com
4•loondri•1h ago
Hi HN! We’re Abhilash Chowdhary, Chris Pisarski, and Manmohit Grewal. We built Crustdata (YC F24). Today we’re launching our web search API for AI agents. It returns the most relevant documents from the web and also maps each one to the correct entity (person, company, or event). Demo video: https://youtu.be/IouWW97hBN8

If you run agents at scale, tokens become a line item. Web data is the worst input: long pages, repeated content, mixed entities, stale claims. The usual flow (web search -> scrape -> summarize + structure) forces the agent to spend tokens on janitorial work before it can take action.

We’re trying to move that work upstream. We keep a canonical graph (ontology) of people and companies: stable internal IDs, aliases, and relationships. Then we continuously index the web and attach each document to the right entity ID. Example: a raw web search for "Stripe pricing changes 2026" returns ~10 results totaling ~4,000 tokens, mostly redundant. We return 6 deduplicated results in ~1,200 tokens, roughly 70% fewer.

This is not just about saving tokens. It also matters because the common failure isn’t “search missed something.” It’s “search found something about the wrong entity.” Names collide. Companies rebrand. Domains move. Press releases get syndicated and look like independent sources. If you treat strings as IDs, you eventually attach evidence to the wrong person/company and the agent takes a confident action based on that mistake.
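
To make the "strings are not IDs" point concrete, here is a minimal sketch, not Crustdata's actual implementation. The class names, the ID format, and the two "Acme" companies are all hypothetical. It shows an alias index keyed by normalized name that returns every candidate entity ID, so an ambiguous mention stays ambiguous instead of being silently attached to the wrong company.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Entity:
    entity_id: str                  # stable internal ID (hypothetical format)
    kind: str                       # e.g. "person" or "company"
    canonical_name: str
    aliases: Set[str] = field(default_factory=set)


class EntityGraph:
    """Toy alias index: normalized name -> set of entity IDs (illustrative only)."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Entity] = {}
        self._alias_index: Dict[str, Set[str]] = {}

    @staticmethod
    def _norm(name: str) -> str:
        # Lowercase and collapse whitespace so trivial variants match.
        return " ".join(name.lower().split())

    def add(self, entity: Entity) -> None:
        self._by_id[entity.entity_id] = entity
        for name in {entity.canonical_name, *entity.aliases}:
            self._alias_index.setdefault(self._norm(name), set()).add(entity.entity_id)

    def candidates(self, mention: str) -> List[str]:
        # Return every entity ID the mention could refer to, sorted for stable output.
        return sorted(self._alias_index.get(self._norm(mention), set()))


# Two made-up companies that share the short alias "Acme".
graph = EntityGraph()
graph.add(Entity("company:001", "company", "Acme Payments", {"Acme"}))
graph.add(Entity("company:002", "company", "Acme Robotics", {"Acme"}))

ambiguous = graph.candidates("ACME")             # two candidates: the string alone is not an ID
unambiguous = graph.candidates("Acme Payments")  # one candidate
```

A real resolver would need more signal (domain, location, co-mentioned people) to pick among the candidates. The sketch only shows why the lookup has to return IDs rather than treat the name as one.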

Under the hood, we run a continuous pipeline that updates the entity-linked index: discover -> fetch -> extract -> dedupe -> entity resolution -> attach -> index. We serve this index through our search API.
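
As a rough illustration of the last four stages (dedupe -> entity resolution -> attach -> index), here is a sketch under stated assumptions, not the production pipeline. Deduplication uses an exact SHA-256 hash of normalized text, a deliberately simple stand-in for whatever near-duplicate detection is really used. The alias table, entity ID, and example.com-style URLs are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical alias table: normalized name -> stable entity ID.
ALIASES: Dict[str, str] = {
    "acme payments": "company:001",
    "acme pay": "company:001",
}


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace.
    return " ".join(text.lower().split())


def fingerprint(text: str) -> str:
    # Exact-match fingerprint of the normalized text (assumption: a real
    # system would likely use fuzzier near-duplicate detection).
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def build_index(docs: Iterable[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """docs: (url, mention, text) tuples that were already fetched and extracted."""
    seen = set()
    index: Dict[str, List[str]] = defaultdict(list)
    for url, mention, text in docs:
        fp = fingerprint(text)
        if fp in seen:                                 # dedupe: skip syndicated copies
            continue
        seen.add(fp)
        entity_id = ALIASES.get(normalize(mention))    # entity resolution
        if entity_id is None:                          # unresolved mentions are not attached
            continue
        index[entity_id].append(url)                   # attach + index by entity ID
    return dict(index)


docs = [
    ("https://a.example/pr", "Acme Payments", "Acme Payments raises prices."),
    ("https://b.example/pr", "Acme Pay", "ACME  Payments raises   prices."),  # syndicated copy
    ("https://c.example/post", "Acme Pay", "A review of the new pricing."),
    ("https://d.example/x", "Unknown Co", "Something else."),
]
index = build_index(docs)
```

Here the syndicated copy is dropped before resolution, and both remaining documents land under one entity ID even though they mention the company by different names.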

We didn’t start with web search. We spent ~2 years building verified people + company data from higher-trust sources. That forced us to build identity as a system, not a string. When we tried to bolt on web search and started building our integrated index of documents + people + companies, we ended up with a pile of local fixes: parser tweaks, domain rules, prompt hacks. Each fix helped one case and broke another because identity isn’t local. That’s when we committed to an entity-first index: pay the entity resolution cost once, then reuse it everywhere.

If you’re building AI agents for sales, recruiting, or investing that do a lot of web searches for people and companies, we’d love for you to try our web search APIs. https://crustdata.com/demo
