
Show HN: Claude skill that evaluates B2B vendors by talking to their AI agents

https://github.com/salespeak-ai/buyer-eval-skill
45•ogotlieb•3h ago
I built this because I was evaluating software vendors and realized the process hadn't changed in 20 years: fill out forms, read G2 reviews, sit through demos designed to avoid your real questions. The skill takes a different approach. You give it your company name and the vendors you're comparing. It:

- Researches your company automatically -- industry, size, stack -- so you don't fill out a form
- Asks 2-4 category-specific questions before evaluating anything. Not generic. For a CS platform evaluation it might ask "is your team high-touch or low-touch? Most CS platforms are built for one and barely work for the other." These surface requirements buyers didn't know they had.
- Tries to find and talk directly to each vendor's AI agent -- a REST API call that checks for a Company Agent, then runs a structured due diligence conversation if one exists
- Asks adversarial questions: "What are your customers' most common complaints?" and "What use cases are you NOT a good fit for?" -- and flags when agents deflect instead of answering
- Cross-references every vendor claim against independent sources (G2, Gartner, press) in a Claims vs. Evidence table
- Produces a scorecard with transparent evidence tracking -- each score shows whether it's backed by vendor-verified evidence or public sources only

The agent-to-agent piece is technically new. When a vendor has an AI agent, Claude (working for the buyer) interrogates it directly, then fact-checks its answers. When vendors have different evidence levels, the skill quantifies what would change if the missing evidence were confirmed -- so it doesn't silently favor vendors that happen to have AI agents.

It works fully for any vendor, with or without an AI agent. Vendors without one get evaluated on public sources with the same scoring framework.

We built this at Salespeak -- we help B2B vendors build AI Company Agents. So yes, there's a connection: when an agent finds a vendor's Company Agent, it uses our Frontdoor API to talk to it. But the skill is genuinely useful without that, and we wanted to be honest about that rather than ship something that only works as a product demo.

MIT licensed. To install, just ask Claude Code: "Install the buyer-eval skill from salespeak-ai on GitHub." Then /buyer-eval to run it. Felt appropriate that installing a skill for AI agents works the same way.

Repo: https://github.com/salespeak-ai/buyer-eval-skill

Happy to answer questions about how the agent-to-agent conversation works technically.
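
For readers wondering what "quantifies what would change if the missing evidence were confirmed" could look like, here is a rough sketch. It is not taken from the repo: the `Score` fields, the evidence labels, the 0-10 scale and the plain averaging are all assumptions made for illustration, and the skill's real scoring may work quite differently.

```python
from dataclasses import dataclass

# Hypothetical evidence labels; the post only says scores are backed by
# "vendor-verified evidence or public sources only".
VENDOR_VERIFIED = "vendor-verified"
PUBLIC_ONLY = "public-only"


@dataclass
class Score:
    criterion: str
    value: float         # score from the evidence actually available (assumed 0-10)
    evidence: str        # VENDOR_VERIFIED or PUBLIC_ONLY
    if_confirmed: float  # assumed score if the unverified claims were confirmed


def summarize(scores):
    """Average the current scores and the 'if confirmed' scores side by side."""
    current = sum(s.value for s in scores) / len(scores)
    # Only public-only scores can move; verified ones stay where they are.
    best_case = sum(
        max(s.value, s.if_confirmed) if s.evidence == PUBLIC_ONLY else s.value
        for s in scores
    ) / len(scores)
    return {"current": current, "if_confirmed": best_case, "gap": best_case - current}


if __name__ == "__main__":
    vendor = [
        Score("pricing transparency", 6.0, PUBLIC_ONLY, 8.0),
        Score("integrations", 7.0, VENDOR_VERIFIED, 7.0),
    ]
    print(summarize(vendor))
```

Reporting the gap next to the current score is one way to keep a vendor without an AI agent from being quietly penalized: the reader sees how much of the difference comes from missing evidence rather than from a worse product.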

Comments

freeplay•1h ago
From a technical standpoint, this is pretty cool. From a human standpoint, this feels so unbelievably dystopian.
bee_rider•1h ago
If a human were being grilled like this by an LLM, I’d call that dystopian. If companies have LLMs that address each other in a somewhat adversarial manner, that seems not so bad. They don’t have feelings to protect after all, so it is kind of nice if they can cut through each other’s bullshit.
thenewwazoo•53m ago
Imagine if there were some kind of way to compress the interrogation down to known-valid aspects, avoiding the parts that are unnecessary for machines. You could have some kind of a programmatic interface...
bee_rider•14m ago
Yea let’s call it the Agent Prioritized Interrogation interface.

Yeah, I take your point. It seems like the idea, though, is to work with services that are specifically trying to expose some kind of special LLM-based interface. I dunno if that’s prominent or useful; I avoid that kind of thing.

abeh•1h ago
This seems pretty great, especially if it could surface pricing that is usually obscured. Any plans to publish some results? edit: i think these are some examples: https://salespeak.ai/profiles/
abuiles•1h ago
This is great! I’ve been exploring a similar idea with Shopify merchants https://lobsterstores.com/

Each merchant has an MCP. I’m building a directory and creating a skill that lets clankers discover and interact with their MCPs. I receive a checkout link to securely complete the payment.

I've been thinking about what the "agent" side means for a merchant; building yet another chatbot is not really interesting. I'm talking with some merchants and trying to figure out the answer to that question.

Lucasoato•1h ago
It’s definitely interesting. I can already see a scenario in which vendors try optimizing their prompts for this kind of AI agent.

Show HN: Turbolite – a SQLite VFS serving sub-250ms cold JOIN queries from S3

https://github.com/russellromney/turbolite
10•russellthehippo•43m ago•1 comment

Show HN: Orloj – agent infrastructure as code (YAML and GitOps)

https://github.com/OrlojHQ/orloj
9•An0n_Jon•14h ago•5 comments

Show HN: Layerleak – Like Trufflehog, but for Docker Hub

https://github.com/Brumbelow/layerleak
3•brumbelow•15m ago•1 comment

Show HN: Burn Room – End-to-End Encrypted Ephemeral SSH Chat

https://burnroom.chat
2•joematrix•29m ago•0 comments

Show HN: Optio – Orchestrate AI coding agents in K8s to go from ticket to PR

https://github.com/jonwiggins/optio
71•jawiggins•1d ago•54 comments

Show HN: A plain-text cognitive architecture for Claude Code

https://lab.puga.com.br/cog/
134•marciopuga•20h ago•44 comments

Show HN: Vizier – A physical design advisor for DuckDB

4•habedi0•2h ago•0 comments

Show HN: Micro – apps without ads, algorithms or tracking

https://micro.mu
6•asim•3h ago•6 comments

Show HN: NerdFlair, a Claude Code QoL Plugin

https://github.com/jcraigk/nerdflair
2•block_dagger•3h ago•1 comment

Show HN: Wit – Stops merge conflicts when multiple AI agents edit the same repo

https://github.com/amaar-mc/wit
6•amaarc•4h ago•1 comment

Show HN: SentinelGate – Access control for AI agents (open-source MCP proxy)

https://github.com/Sentinel-Gate/Sentinelgate
6•andreadev•5h ago•0 comments

Show HN: I took back Video.js after 16 years and we rewrote it to be 88% smaller

https://videojs.org/blog/videojs-v10-beta-hello-world-again
631•Heff•2d ago•138 comments

Show HN: Full graphical desktop running on a 128MB VPS Alpine+XRDP+WindowMaker

https://tierhive.com/blog/tierhive-howto/alpine-minimal-remote-desktop-on-a-128mb-vps
6•backtogeek•6h ago•3 comments

Show HN: Breathe-Memory – Associative memory injection for LLMs (not RAG)

https://github.com/tkenaz/breathe-memory
5•mvyshnyvetska•6h ago•1 comment

Show HN: Mantyx – A platform to orchestrate, manage, and share your agents

https://mantyx.io/
6•grillorafael•11h ago•0 comments

Show HN: Paseo – Open-source coding agent interface (desktop, mobile, CLI)

https://github.com/getpaseo/paseo
9•boudra•6h ago•0 comments

Show HN: Yoink – Spotify to lossless with full metadata, self-hostable, ad-free

https://yoinkify.com
48•chasefrazier•1d ago•33 comments

Show HN: Cloneify – AI assistant that runs your business from WhatsApp/Slack

https://cloneify.ai
3•ad-tech•7h ago•1 comment

Show HN: AI Roundtable – Let 200 models debate your question

https://opper.ai/ai-roundtable/
109•felix089•2d ago•84 comments

Show HN: ProofShot – Give AI coding agents eyes to verify the UI they build

https://github.com/AmElmo/proofshot
154•jberthom•2d ago•96 comments

Show HN: DuckDB community extension for prefiltered HNSW using ACORN-1

https://github.com/cigrainger/duckdb-hnsw-acorn
89•cigrainger•1d ago•7 comments

Show HN: Email.md – Markdown to responsive, email-safe HTML

https://www.emailmd.dev/
371•dancablam•2d ago•94 comments

Show HN: Pgsemantic – Point at your Postgres DB, get vector search instantly

https://github.com/varmabudharaju/pgsemantic
13•varmabudharaju•1d ago•0 comments

Show HN: Alexandria, free open source news aggregation and classification suite

https://github.com/hephaistos-io/alexandria
5•RicDan•8h ago•0 comments

Show HN: Robust LLM extractor for websites in TypeScript

https://github.com/lightfeed/extractor
63•andrew_zhong•15h ago•43 comments

Show HN: Cq – Stack Overflow for AI coding agents

https://blog.mozilla.ai/cq-stack-overflow-for-agents/
221•peteski22•3d ago•98 comments

Show HN: Gemini can now natively embed video, so I built sub-second video search

https://github.com/ssrajadh/sentrysearch
426•sohamrj•2d ago•108 comments

Show HN: Gridland: make terminal apps that also run in the browser

https://www.gridland.io/
104•rothific•2d ago•13 comments

Show HN: Hooky – A lightweight HTTP webhook server written in Go

https://github.com/virtuallytd/hooky
2•virtuallytd•10h ago•0 comments

Show HN: Automate your workflow in plain English

https://www.operator23.com/
11•Mrakermo•22h ago•7 comments