
Have you ever wondered whether your SDK is friendly to agentic AI tools like Claude Code or Codex? I built an open-source (Apache 2.0) CLI that answers that question for you.

With it, you can create a test suite either manually or with an agent, based on your source code and documentation. The CLI dispatches agents, each in its own sandboxed microVM, to solve each test. The results are then graded by a separate judge agent.

Test-taker agents only have access to public information (guides, blogs, package metadata), while judge agents have access to both public and private information (source code, internal documents).
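To make the access split concrete, here is a minimal Python sketch of the idea. It is not taken from the project: the `Role`, `Visibility`, `Source`, and `visible_sources` names are hypothetical and only illustrate the rule that test-takers see public material while judges see everything.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    PUBLIC = "public"    # guides, blogs, package metadata
    PRIVATE = "private"  # source code, internal documents


class Role(Enum):
    TEST_TAKER = "test_taker"
    JUDGE = "judge"


@dataclass(frozen=True)
class Source:
    """A piece of reference material (hypothetical model, not the tool's schema)."""
    name: str
    visibility: Visibility


def visible_sources(role: Role, sources: list[Source]) -> list[Source]:
    """Return the sources an agent with the given role may read."""
    if role is Role.JUDGE:
        # Judges grade against both public and private information.
        return list(sources)
    # Test-takers are restricted to public information only.
    return [s for s in sources if s.visibility is Visibility.PUBLIC]


SOURCES = [
    Source("getting-started guide", Visibility.PUBLIC),
    Source("package metadata", Visibility.PUBLIC),
    Source("SDK source code", Visibility.PRIVATE),
]

taker_view = [s.name for s in visible_sources(Role.TEST_TAKER, SOURCES)]
judge_view = [s.name for s in visible_sources(Role.JUDGE, SOURCES)]
```

The point of the split is that a test-taker's result reflects what an outside agent could actually learn about the SDK, while the judge can check the answer against ground truth.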

After the test results are generated, you can improve your SDK manually or use an agent to automate the process.

Agents are sandboxed, which means:
- Host machine secrets (API keys) are not exposed to the sandbox environment
- Egress HTTP requests are monitored, and judge agents' egress is limited to trusted domains so that proprietary IP is not exfiltrated
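As a rough illustration of what limiting egress to trusted domains can look like, here is a small Python sketch of a host allowlist check. It is an assumption about the general technique, not the project's implementation: `TRUSTED_DOMAINS` and `is_egress_allowed` are hypothetical names, and the domains are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; the tool's real trusted domains are not stated in this post.
TRUSTED_DOMAINS = frozenset({"api.example-llm.test", "registry.example.test"})


def is_egress_allowed(url: str, trusted: frozenset[str] = TRUSTED_DOMAINS) -> bool:
    """Return True if the URL's host is a trusted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return False
    # Match on the parsed hostname, never on the raw URL string, so a trusted
    # name hidden in the path or query string cannot pass the check.
    return any(host == d or host.endswith("." + d) for d in trusted)
```

Comparing against the parsed hostname matters: a request to `https://evil.test/?q=api.example-llm.test` contains a trusted name but would still be rejected.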

Features:
- CLI commands for the entire workflow of generating, evaluating, and reporting on test suites
- Agent skills for each command
- Local web UI if you want to inspect test results and edit test cases visually

GitHub: https://github.com/PSPDFKit-labs/agentic-usability

Show HN: I built a way to see if your SDK is AI-friendly