Ask HN: With Promptfoo acquired by OpenAI, what are MCP devs using for testing?

2•warmcat•1h ago
With OpenAI's acquisition of Promptfoo last week, I've been thinking about the testing gap for MCP servers specifically. Promptfoo was great for LLM evaluation but didn't handle MCP's transport layer, tool schema validation, or MCP-specific vulnerabilities like Tool Poisoning.

What are people using to test their MCP servers in CI? The MCP Inspector is interactive-only.

I've personally been building MCPSpec [https://light-handle.github.io/mcpspec/], but I'm curious what approaches others are taking: custom scripts, unit tests on server internals, something else?

Comments

rajivnaskar•1h ago
One thing I've been running into when working with LLM tooling is that a lot of the friction happens even before testing — just preparing context for the model.
westurner•19m ago
/? mcp test: https://hn.algolia.com/?q=mcp+test :

MCP Playground, MCPSpec, MCPjam, MCP Inspector, mcp-record, mcpbr, agent-vcr

mcpbr > Supported Benchmarks : https://github.com/supermodeltools/mcpbr#supported-benchmark... :

> MCPToolBench++, ToolBench, AgentBench, WebArena, TerminalBench, InterCode; SWE-bench,

warmcat•4m ago
This is a good list. Are you finding these can be integrated into your CI/CD workflows?
