Show HN: Rhesis – Open-source platform for collaborative LLM application testing

https://github.com/rhesis-ai/rhesis
1•nicolaib•1h ago
Hi HN, I'm Nicolai. I'm working with a small team in Germany on Rhesis, an open-source platform for testing conversational LLM applications and agents. We’re sharing an early community preview today.

Why we built this: We saw teams repeatedly struggle with testing: scattered test cases, unclear or inconsistent metrics, and a lot of manual effort that still missed obvious failures before production. Most tools assume a single developer runs evals alone; in practice, testing tends to involve PMs, domain experts, QA, and engineers. We built Rhesis to make that collaboration straightforward.

What it does: Rhesis is a self-hostable platform (with UI) where teams can create, run, and review tests for conversational AI systems. A few core ideas:

- Test generation: Create and run tests for single turns or full conversations; the platform can also assist with generating both single- and multi-turn scenarios using your domain context.
- Domain context / knowledge: Provide background material to guide test creation so you’re not starting from an empty prompt.
- Collaboration tools: Non-technical teammates can write test cases, leave comments, and review results; developers can dig into failures with detailed traces and outputs.
- Unified metrics: Bring in eval metrics from DeepEval, RAGAS, and similar OSS frameworks without re-implementing them.
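To make the single-turn versus multi-turn distinction concrete, here is a minimal Python sketch. It is not the Rhesis API: `Turn`, `Scenario`, `run_test`, `contains_expected`, and `toy_app` are all hypothetical names made up for illustration, and the substring check merely stands in for a real metric from an eval framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical types for illustration only; not the Rhesis API.
@dataclass
class Turn:
    user: str          # what the simulated user says
    must_contain: str  # substring the reply is expected to include


@dataclass
class Scenario:
    name: str
    turns: List[Turn]  # one turn = single-turn test, several = a conversation
    comments: List[str] = field(default_factory=list)  # reviewer notes


# A metric scores one reply against one turn, returning a value in [0, 1].
Metric = Callable[[str, Turn], float]
# The app under test receives the transcript so far plus the new message.
App = Callable[[List[str], str], str]


def contains_expected(reply: str, turn: Turn) -> float:
    """Stand-in metric: 1.0 if the expected substring appears, else 0.0."""
    return 1.0 if turn.must_contain.lower() in reply.lower() else 0.0


def run_test(case: Scenario, app: App, metrics: Dict[str, Metric]) -> Dict[str, float]:
    """Replay every turn in order and average each metric over the turns."""
    history: List[str] = []
    scores: Dict[str, List[float]] = {name: [] for name in metrics}
    for turn in case.turns:
        reply = app(history, turn.user)
        history.append(f"user: {turn.user}")
        history.append(f"assistant: {reply}")
        for name, metric in metrics.items():
            scores[name].append(metric(reply, turn))
    return {name: (sum(v) / len(v) if v else 0.0) for name, v in scores.items()}


def toy_app(history: List[str], message: str) -> str:
    """Toy application under test; a real one would call an LLM."""
    if "refund" in message.lower():
        return "Refunds are processed within 14 days."
    return "Sorry, I can't help with that."


single = Scenario("refund-single", [Turn("How do refunds work?", "14 days")])
multi = Scenario(
    "refund-then-shipping",
    [Turn("How do refunds work?", "14 days"), Turn("What about shipping?", "shipping")],
)

for case in (single, multi):
    print(case.name, run_test(case, toy_app, {"contains": contains_expected}))
```

Any scorer with the same `(reply, turn) -> float` shape could be passed in the `metrics` dictionary, which is one simple way to treat metrics from different frameworks uniformly.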

Current state: Still early. We shipped v0.4.2 last week with a zero-config Docker setup. Core flows work, but there are rough edges. Everything is MIT-licensed; an enterprise edition will come later, but the OSS core will remain free. We’re currently focused on conversational applications because that’s where we saw the biggest pain in evaluation and QA workflows.

Links:
- App: app.rhesis.ai
- GitHub: github.com/rhesis-ai/rhesis
- Docs: docs.rhesis.ai

Happy to hear your thoughts and answer any questions about the platform design, the architecture, or our thinking on collaborative testing workflows.

World Labs – Building 3D spatial-AI world models

https://www.worldlabs.ai/
1•Brysonbw•38s ago•0 comments

The Atom Bomb and Japanese Christianity [pdf]

https://isonomiaquarterly.com/wp-content/uploads/2025/11/iq-3.4-zellen-nagasaki.pdf
1•brandonlc•2m ago•1 comments

The House Draws the Line at Jeffrey Epstein

https://www.bloomberg.com/opinion/articles/2025-11-18/epstein-vote-is-congress-line-in-the-sand-w...
1•wslh•3m ago•1 comments

Dr. Fei-Fei Li on jobs, robots and why world models are next

https://www.youtube.com/watch?v=Ctjiatnd6Xk
1•Brysonbw•4m ago•0 comments

The False Glorification of Yann LeCun

https://garymarcus.substack.com/p/the-false-glorification-of-yann-lecun
1•guilamu•4m ago•0 comments

Paiml/Depyler: Compiles Python to Rust, Helping Transition to Rust Code

https://github.com/paiml/depyler
1•rbanffy•4m ago•0 comments

Beyond the Primary User: 3 Types of Smart-Home Users

https://www.nngroup.com/articles/smart-home-users/
1•ulrischa•5m ago•0 comments

Show HN: Polymarket/Kalshi Arbitrage Scanner Powered by Gemini Pro 3

https://arb.carolinacloud.io/
1•bojangleslover•5m ago•0 comments

Cloudflare CTO: This was not an attack

https://twitter.com/dok2001/status/1990791419653484646
2•doener•6m ago•0 comments

Chicken Caesars: they're messing with your Bluesky feed

https://thedabbler.patatas.ca/pages/bluesky-caesars.html
2•Sophira•10m ago•0 comments

Text to CAD for Aircraft Design

https://strato.so/
1•k1a11220•10m ago•0 comments

Meta Did Not Violate Antitrust Law, Judge Rules

https://www.nytimes.com/2025/11/18/technology/meta-antitrust-monopoly-ruling.html
6•lateforwork•11m ago•0 comments

Intel LASS Feature Looks Like It Will Be Upstreamed for Linux 6.19

https://www.phoronix.com/news/Intel-LASS-For-Linux-6.19
3•doener•11m ago•0 comments

Red Hat Losing Another Longtime and Prominent Linux Kernel Engineer

https://www.phoronix.com/news/Red-Hat-David-H-Leaving
2•Bender•11m ago•0 comments

A Week with Elixir (2013)

https://joearms.github.io/published/2013-05-31-a-week-with-elixir.html
1•giancarlostoro•12m ago•0 comments

Electric motorcycle folds down to the size of a carry-on suitcase

https://electrek.co/2025/11/18/this-electric-motorcycle-folds-down-to-the-size-of-a-carry-on-suit...
1•Bender•13m ago•1 comments

Energy and AI – Analysis

https://www.iea.org/reports/energy-and-ai
1•Anon84•14m ago•0 comments

Cloudflare Outage Not Caused by Cyberattack

https://www.securityweek.com/cloudflare-says-highly-disruptive-outage-not-caused-by-attack/
2•Bender•14m ago•0 comments

How Not to Lose Your IP When Developing a Product with Your China Factory (2020)

https://harris-sliwoski.com/chinalawblog/how-not-to-lose-your-ip-when-developing-a-product-with-y...
2•DustinEchoes•14m ago•0 comments

Researchers find the gas pedal and brake for anxiety, and they aren't neurons

https://www.psypost.org/researchers-find-the-gas-pedal-and-brake-for-anxiety-and-they-arent-neurons/
2•glassounds•15m ago•0 comments

Why People Become Overweight

https://www.health.harvard.edu/staying-healthy/why-people-become-overweight
2•paulpauper•15m ago•1 comments

CS/ML PhD: Debating Between Internship and Full-Time Offers

1•ynliPbqM•17m ago•0 comments

Meta-analysis of resting metabolic rate in formerly obese subjects

https://pubmed.ncbi.nlm.nih.gov/10357728/
1•paulpauper•19m ago•0 comments

Happy holiday shopping season in the low-trust economy

https://blog.zgp.org/happy-holiday-shopping-season/
1•speckx•19m ago•0 comments

Who has the biggest footprint on the Web?

1•Pocomon•22m ago•0 comments

Choosing a Vector Database for Reddit

https://old.reddit.com/r/RedditEng/comments/1ozxnjc/choosing_a_vector_database_for_ann_search_at/
1•softwaredoug•25m ago•0 comments

New Arduino Privacy Policy: "user shall not [...] reverse-engineer the platform"

https://bsky.app/profile/ptorrone.bsky.social/post/3m5wcakoip22u
3•gregsadetsky•25m ago•0 comments

Post-Quantum Cryptography in .NET

https://devblogs.microsoft.com/dotnet/post-quantum-cryptography-in-dotnet/
1•doomroot13•28m ago•0 comments

Introducing flat-rate pricing plans with no overages

https://aws.amazon.com/blogs/networking-and-content-delivery/introducing-flat-rate-pricing-plans-...
3•cristiangraz•28m ago•0 comments

UC Berkeley scientists hail breakthrough in decoding whale communication

https://www.sfgate.com/bayarea/article/scientists-breakthrough-decoding-whales-21184413.php
1•joak•29m ago•0 comments