
MCP servers are proliferating fast, but most have vague tool descriptions and incomplete schemas that make LLMs pick the wrong tool or fill parameters incorrectly.

AgentDX is a CLI that measures how well an LLM can pick and call a server's tools. It has two commands:

- `npx agentdx lint` — static analysis of tool descriptions, schemas, and naming. 18 rules, zero config, no API key. Produces a lint score.

- `npx agentdx bench` — sends your tool definitions to an LLM (Anthropic, OpenAI, or Ollama) and evaluates tool selection accuracy, parameter correctness, ambiguity handling, multi-tool orchestration, and error recovery. Produces an Agent DX Score (0-100).
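To give a feel for the kind of static check `lint` performs, here is a minimal sketch of a few description and schema checks. The types, rule names, the 20-character threshold, and the scoring formula are all hypothetical illustrations, not AgentDX's actual rules or API.

```typescript
// Hypothetical shapes for illustration; not AgentDX's real types.
interface ToolParam { type?: string; description?: string }
interface ToolDef {
  name: string;
  description?: string;
  inputSchema?: { properties?: Record<string, ToolParam>; required?: string[] };
}
interface Finding { tool: string; rule: string; message: string }

// Run a handful of assumed rules against one tool definition.
function lintTool(tool: ToolDef): Finding[] {
  const findings: Finding[] = [];
  const add = (rule: string, message: string): void => {
    findings.push({ tool: tool.name, rule, message });
  };

  // Assumed rule: a description must exist and say more than a few words.
  const desc = (tool.description ?? "").trim();
  if (desc.length === 0) {
    add("description-missing", "Tool has no description.");
  } else if (desc.length < 20) {
    add("description-too-short", "Description is under 20 characters.");
  }

  // Assumed rules: every parameter needs a type and a description.
  const props = tool.inputSchema?.properties ?? {};
  for (const [name, param] of Object.entries(props)) {
    if (!param.type) add("param-missing-type", `Parameter "${name}" has no type.`);
    if (!param.description) {
      add("param-missing-description", `Parameter "${name}" has no description.`);
    }
  }

  // Assumed rule: every required name must be declared in properties.
  for (const name of tool.inputSchema?.required ?? []) {
    if (!(name in props)) {
      add("required-unknown", `Required parameter "${name}" is not declared.`);
    }
  }
  return findings;
}

// Hypothetical scoring: start at 100 and subtract 10 per finding, floor at 0.
function lintScore(tools: ToolDef[]): number {
  const total = tools.reduce((n, t) => n + lintTool(t).length, 0);
  return Math.max(0, 100 - 10 * total);
}
```

A tool named `get` with the description "Gets it" and an untyped, undescribed `id` parameter would trip several of these checks at once, which is the sort of definition that tends to confuse a model.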

It auto-detects the server entry point, spawns it, connects as an MCP client, and reads tools via the protocol. Bench auto-generates test scenarios from your tool definitions.
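As a rough illustration of how tool selection accuracy could be scored once scenarios exist, here is a small sketch. Everything in it is an assumption: the `Scenario` shape, the `Picker` type (a synchronous stand-in for the LLM call), the keyword-overlap picker, and the 0-100 rounding are hypothetical and not taken from AgentDX.

```typescript
// Hypothetical shapes for illustration; not AgentDX's real types.
interface Tool { name: string; description: string }
interface Scenario { prompt: string; expectedTool: string }

// Stand-in for the model: given a prompt and the tool list, return a tool name.
// A real bench would call an LLM provider here, most likely asynchronously.
type Picker = (prompt: string, tools: Tool[]) => string | null;

// Percentage of scenarios where the picked tool matches the expected one.
function selectionAccuracy(scenarios: Scenario[], tools: Tool[], pick: Picker): number {
  if (scenarios.length === 0) return 0;
  let correct = 0;
  for (const s of scenarios) {
    if (pick(s.prompt, tools) === s.expectedTool) correct += 1;
  }
  return Math.round((correct / scenarios.length) * 100);
}

// Naive picker used only to make the sketch runnable: choose the tool whose
// description shares the most words with the prompt.
const overlapPicker: Picker = (prompt, tools) => {
  const words = new Set(prompt.toLowerCase().split(/\W+/).filter(Boolean));
  let best: string | null = null;
  let bestScore = 0;
  for (const t of tools) {
    const score = t.description
      .toLowerCase()
      .split(/\W+/)
      .filter((w) => words.has(w)).length;
    if (score > bestScore) {
      bestScore = score;
      best = t.name;
    }
  }
  return best;
};
```

Swapping `overlapPicker` for a function that wraps a real model call would keep the scoring loop unchanged, which is why the picker is passed in rather than hard-coded.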

Built in TypeScript, MIT licensed. Early alpha — the bench command works but is slow (sequential LLM calls, parallelization is next). Feedback welcome.

Show HN: AgentDX – Open-source linter and LLM benchmark for MCP servers