Show HN: Rhesis – Open-source platform for collaborative LLM application testing

https://github.com/rhesis-ai/rhesis
3•nicolaib•2mo ago
Hi HN, I'm Nicolai. I'm working with a small team in Germany on Rhesis, an open-source platform for testing conversational LLM applications and agents. We’re sharing an early community preview today.

Why we built this: We saw teams repeatedly struggle with testing: scattered test cases, unclear or inconsistent metrics, and a lot of manual effort that still missed obvious failures before production. Most tools assume a single developer runs evals alone; in practice, testing tends to involve PMs, domain experts, QA, and engineers. We built Rhesis to make that collaboration straightforward.

What it does: Rhesis is a self-hostable platform (with UI) where teams can create, run, and review tests for conversational AI systems.

A few core ideas:

- Test generation: Create and run tests for single turns or full conversations; the platform can also help generate both single- and multi-turn scenarios from your domain context.

- Domain context / knowledge: Provide background material to guide test creation so you’re not starting from an empty prompt.

- Collaboration tools: Non-technical teammates can write test cases, leave comments, and review results; developers can dig into failures with detailed traces and outputs.

- Unified metrics: Bring in eval metrics from DeepEval, RAGAS, and similar OSS frameworks without re-implementing them.
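
To make the ideas above concrete, here is a minimal Python sketch of how single- and multi-turn test cases and a shared metric interface could be modeled. It is illustration only and not the Rhesis API: `Turn`, `ScenarioCase`, `contains_expected`, `run_case`, and `toy_app` are all hypothetical names, and a real setup would presumably plug in metrics from frameworks such as DeepEval or RAGAS rather than a substring check.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; these are NOT Rhesis classes.
History = List[Tuple[str, str]]          # (role, text) pairs
Metric = Callable[[str, str], float]     # (reply, expected) -> score in [0, 1]


@dataclass
class Turn:
    user: str        # what the simulated user says
    expected: str    # a phrase the reply should contain


@dataclass
class ScenarioCase:
    name: str
    turns: List[Turn]                    # one turn = single-turn test
    author: str = "unknown"              # e.g. a PM or domain expert
    comments: List[str] = field(default_factory=list)


def contains_expected(reply: str, expected: str) -> float:
    """Toy metric: 1.0 if the expected phrase appears in the reply."""
    return 1.0 if expected.lower() in reply.lower() else 0.0


def run_case(case: ScenarioCase,
             app: Callable[[History], str],
             metrics: Dict[str, Metric]) -> Dict[str, float]:
    """Send each turn to `app` and average every metric over the turns."""
    history: History = []
    totals = {name: 0.0 for name in metrics}
    for turn in case.turns:
        history.append(("user", turn.user))
        reply = app(history)
        history.append(("assistant", reply))
        for name, metric in metrics.items():
            totals[name] += metric(reply, turn.expected)
    n = max(len(case.turns), 1)          # avoid dividing by zero
    return {name: total / n for name, total in totals.items()}


def toy_app(history: History) -> str:
    """Stand-in for the conversational system under test."""
    last = history[-1][1].lower()
    if "refund" in last:
        return "Refunds are processed within 5 business days."
    return "Sorry, I can't help with that."


case = ScenarioCase(
    name="refund timing",
    author="support lead",
    turns=[
        Turn("How long does a refund take?", "5 business days"),
        Turn("Can I get it faster?", "expedite"),
    ],
)
scores = run_case(case, toy_app, {"contains_expected": contains_expected})
print(scores)  # {'contains_expected': 0.5}
```

Keeping every metric behind one callable signature is what lets scores from different sources sit side by side in a single report; the second turn fails here because the toy app loses the refund context, which is the kind of failure a single-turn test would not surface.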

Current state: Still early. We shipped v0.4.2 last week with a zero-config Docker setup. Core flows work, but there are rough edges. Everything is MIT-licensed; an enterprise edition will come later, but the OSS core will remain free. We’re currently focused on conversational applications because that’s where we saw the biggest pain in evaluation and QA workflows.

Links:

App: app.rhesis.ai

GitHub: github.com/rhesis-ai/rhesis

Docs: docs.rhesis.ai

Happy to hear your thoughts and answer any questions about the platform design, the architecture, or our thinking on collaborative testing workflows.

Comments

lunarain•2mo ago
Interesting, do you support multi-turn prompt evals as well?
nicolaib•2mo ago
Yes, we do. For this we developed Penelope, an autonomous testing agent that executes complex, multi-turn test scenarios against conversational AI systems.

https://github.com/rhesis-ai/rhesis/tree/main/penelope
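
For readers unfamiliar with the idea, a goal-driven multi-turn test loop can be sketched roughly as below. This is a generic illustration under my own assumptions and does not reflect Penelope's actual design or API: `run_scenario`, `scripted_tester`, `toy_app`, and `cancelled` are hypothetical names, and an autonomous agent would decide the next message with a model instead of a fixed script.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (role, text) pairs


def run_scenario(
    app: Callable[[History], str],
    next_message: Callable[[History], Optional[str]],
    goal_reached: Callable[[History], bool],
    max_turns: int = 5,
) -> Tuple[bool, History]:
    """Alternate tester and app messages until the goal is met or turns run out."""
    history: History = []
    for _ in range(max_turns):
        message = next_message(history)
        if message is None:              # tester has nothing more to try
            break
        history.append(("user", message))
        history.append(("assistant", app(history)))
        if goal_reached(history):
            return True, history
    return False, history


def scripted_tester(history: History) -> Optional[str]:
    """Hypothetical tester: walks a fixed script, one message per turn."""
    script = ["Hi", "I want to cancel my order", "Order 42"]
    sent = len(history) // 2             # each turn adds two entries
    return script[sent] if sent < len(script) else None


def toy_app(history: History) -> str:
    """Stand-in for the conversational system under test."""
    text = history[-1][1].lower()
    if "order 42" in text:
        return "Order 42 has been cancelled."
    if "cancel" in text:
        return "Sure, which order number?"
    return "Hello! How can I help?"


def cancelled(history: History) -> bool:
    """Goal check: the latest reply confirms the cancellation."""
    return "cancelled" in history[-1][1].lower()


ok, transcript = run_scenario(toy_app, scripted_tester, cancelled)
print(ok, len(transcript))  # True 6
```

The `max_turns` cap matters in practice: without it, a tester that never reaches its goal would keep the conversation going indefinitely.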