Show HN: Rhesis – Open-source platform for collaborative LLM application testing

https://github.com/rhesis-ai/rhesis
3•nicolaib•2mo ago
Hi HN, I'm Nicolai. I'm working with a small team in Germany on Rhesis, an open-source platform for testing conversational LLM applications and agents. We’re sharing an early community preview today.

Why we built this: We saw teams repeatedly struggle with testing: scattered test cases, unclear or inconsistent metrics, and a lot of manual effort that still missed obvious failures before production. Most tools assume a single developer runs evals alone; in practice, testing tends to involve PMs, domain experts, QA, and engineers. We built Rhesis to make that collaboration straightforward.

What it does: Rhesis is a self-hostable platform (with UI) where teams can create, run, and review tests for conversational AI systems.

A few core ideas:

- Test generation: Create and run tests for single turns or full conversations; the platform can also help generate both single- and multi-turn scenarios from your domain context.

- Domain context / knowledge: Provide background material to guide test creation so you’re not starting from an empty prompt.

- Collaboration tools: Non-technical teammates can write test cases, leave comments, and review results; developers can dig into failures with detailed traces and outputs.

- Unified metrics: Bring in eval metrics from DeepEval, RAGAS, and similar OSS frameworks without re-implementing them.
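
To make the unified-metrics idea a bit more concrete, here is a minimal, hypothetical sketch in Python. It does not use the Rhesis, DeepEval, or RAGAS APIs; every name in it (`Metric`, `keyword_coverage`, `max_length`, `run_metrics`) is an assumption made for illustration. It only shows the general pattern of putting differently implemented scoring functions behind one signature so they can be run and reported together.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical common signature: (conversation turns, final reply) -> score in [0, 1].
ScoreFn = Callable[[List[str], str], float]


@dataclass
class Metric:
    """A named scoring function plus the minimum score that counts as a pass."""
    name: str
    score: ScoreFn
    threshold: float


def keyword_coverage(keywords: List[str]) -> ScoreFn:
    """Build a scorer returning the fraction of expected keywords found in the reply."""
    def score(turns: List[str], reply: str) -> float:
        if not keywords:
            return 1.0
        hits = sum(1 for k in keywords if k.lower() in reply.lower())
        return hits / len(keywords)
    return score


def max_length(limit: int) -> ScoreFn:
    """Build a scorer returning 1.0 if the reply fits within `limit` characters, else 0.0."""
    def score(turns: List[str], reply: str) -> float:
        return 1.0 if len(reply) <= limit else 0.0
    return score


def run_metrics(metrics: List[Metric], turns: List[str], reply: str) -> Dict[str, bool]:
    """Run every metric on the same conversation and report pass/fail per metric name."""
    return {m.name: m.score(turns, reply) >= m.threshold for m in metrics}


# Usage: two unrelated checks evaluated through the same interface.
metrics = [
    Metric("coverage", keyword_coverage(["refund", "14 days"]), threshold=0.5),
    Metric("brevity", max_length(50), threshold=1.0),
]
result = run_metrics(
    metrics,
    turns=["How do refunds work?"],
    reply="Refunds are issued within 14 days.",
)
print(result)
```

In a real setup the scoring functions would presumably wrap metrics from an existing evaluation framework; the toy scorers here only stand in for them.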

Current state: Still early. We shipped v0.4.2 last week with a zero-config Docker setup. Core flows work, but there are rough edges. Everything is MIT-licensed; an enterprise edition will come later, but the OSS core will remain free. We’re currently focused on conversational applications because that’s where we saw the biggest pain in evaluation and QA workflows.

Links:

- App: app.rhesis.ai

- GitHub: github.com/rhesis-ai/rhesis

- Docs: docs.rhesis.ai

Happy to hear your thoughts and answer any questions about the platform design, the architecture, or our thinking on collaborative testing workflows.

Comments

lunarain•2mo ago
Interesting. Do you support multi-turn prompt evals as well?
nicolaib•2mo ago
Yes, we do. We developed Penelope for this, which is an autonomous testing agent that executes complex, multi-turn test scenarios against conversational AI systems.

https://github.com/rhesis-ai/rhesis/tree/main/penelope