Show HN: Open Evaluation

https://openevaluation.ai/
3•cjcenizal•4h ago
Hey HN, this will likely interest you if you're a) into dense data visualization or b) trying to figure out how to measure the quality of a retrieval-augmented generation (RAG) system.

There's an OSS tool called Open RAG Eval that analyzes RAG-based query-and-answer sets to generate a dense set of metrics in an "evaluation report". This report is in CSV format and the data is basically impossible for a human to read because there's so much of it.

I built Open Evaluation to enable folks to load in a report and visualize the evaluation metrics in a more human-readable way. The challenge was the sheer amount of information to visualize. I went with a collapsible table with sticky headers to present the info, so you can compare metrics across reports and questions. I also tried to make everything clickable, so if you want to understand the meaning behind a metric you can just click it to open up an info panel to learn more about it.
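
To make the "compare metrics across reports and questions" idea concrete, here is a minimal sketch of how several CSV reports could be pivoted into one question-by-metric-by-report structure. This is not how Open Evaluation or Open RAG Eval is implemented: the column names (`question`, `relevance`, `groundedness`), the report names, and the values are all made-up assumptions for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical report contents: the column names and values below are
# illustrative assumptions, not the actual Open RAG Eval schema.
SAMPLE_REPORTS = {
    "report_a": "question,relevance,groundedness\nq1,0.9,0.8\nq2,0.5,0.7\n",
    "report_b": "question,relevance,groundedness\nq1,0.7,0.6\nq2,0.6,0.9\n",
}


def pivot_reports(reports):
    """Return {question: {metric: {report_name: value}}} for side-by-side comparison."""
    table = defaultdict(lambda: defaultdict(dict))
    for report_name, csv_text in reports.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            # Every column other than "question" is treated as a numeric metric.
            question = row.pop("question")
            for metric, raw_value in row.items():
                table[question][metric][report_name] = float(raw_value)
    return table


if __name__ == "__main__":
    pivot = pivot_reports(SAMPLE_REPORTS)
    for question, metrics in pivot.items():
        for metric, by_report in metrics.items():
            print(question, metric, by_report)
```

Keying by question first, then metric, then report puts the values you would want to compare next to each other, which is roughly the shape a collapsible comparison table needs.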

The site has built-in sample evaluation reports, so you can try it out without needing to generate your own. If you give it a shot, please share your feedback. I'd love to find ways to make this more usable.

Full disclosure: I did this for work and my coworkers also made Open RAG Eval.

VisionOS Godot Engine support merged

1•iFire•12s ago•0 comments

Shipwrecks from John Franklin Doomed Arctic Expedition Where Inuit Said

https://www.smithsonianmag.com/history/the-shipwrecks-from-john-franklins-doomed-arctic-expedition-were-exactly-where-the-inuit-said-they-would-be-180986627/
1•bookofjoe•1m ago•0 comments

Getting Kids Interested in Coding

https://martiancraft.com/blog/2025/05/kids-coding/
1•ingve•3m ago•0 comments

Autoselect the best AI model for any health question using HealthBench scores

https://twitter.com/yaroshipilov/status/1924926821201805625
1•shipilovya•3m ago•1 comments

Cheating Expert Answers Casino Cheating Questions [video]

https://www.youtube.com/watch?v=0QWP4IZOu0I
1•Bluestein•4m ago•0 comments

Developers: Is training taking a back seat?

https://www.techradar.com/pro/developers-is-training-taking-a-back-seat
1•mooreds•6m ago•0 comments

Arch Linux on Mac Pro 1.1/2.1

https://wiki.ponovo.rs/wiki/Linux_on_Mac_Pro_1.1/2.1
1•marxo•7m ago•0 comments

FastAPI and Next.js User Auth

https://www.david-crimi.com/blog/user-auth
1•Crimid01•7m ago•0 comments

Disable debuginfo to improve Rust compile times

https://kobzol.github.io/rust/rustc/2025/05/20/disable-debuginfo-to-improve-rust-compile-times.html
1•ingve•8m ago•0 comments

Things money can't buy – like happiness and better health

https://news.harvard.edu/gazette/story/2025/05/things-money-cant-buy-like-happiness-and-better-health/
2•gnabgib•8m ago•0 comments

System-wide Technology Outage

https://ketteringhealth.org/system-wide-technology-outage/
1•mattbsheets•12m ago•0 comments

Strands Agents, an Open Source AI Agents SDK

https://aws.amazon.com/blogs/opensource/introducing-strands-agents-an-open-source-ai-agents-sdk/
1•jaredwiener•13m ago•0 comments

Show HN: Copilot Audit – PDF->Excel with AI

https://copilot-audit.com/
1•miwend•15m ago•0 comments

Telekons: Some high performance code in Common Lisp

https://github.com/telekons
2•doener•17m ago•0 comments

The Netfarm Suite: a replicated, mostly-trustless object system

https://gitlab.com/cal-coop/netfarm
2•doener•18m ago•0 comments

Text reminders for court hearings can boost justice system efficiency

https://www.route-fifty.com/digital-government/2025/05/report-text-reminders-court-hearings-can-help-boost-justice-system-efficiency/405352/
3•gnabgib•18m ago•0 comments

Our Journey Through Linux/Unix Landscapes

https://blog.kalvad.com/our-journey-through-linux-unix-landscapes/
2•alekq•18m ago•0 comments

Cheers star George Wendt dies at 76

https://www.bbc.com/news/articles/cx2xx998102o
2•austinallegro•18m ago•1 comments

Jason Padgett

https://en.wikipedia.org/wiki/Jason_Padgett
2•sans_souse•19m ago•1 comments

Show HN: I made a tool to repurpose a TikTok video from another account

https://github.com/best-trading-indicator-tools/10XReach
2•Daveatt•20m ago•0 comments

Matrix Governing Board Elections 2025

https://matrix.org/foundation/governing-board-elections/2025/#nominees
1•doener•23m ago•0 comments

Why Good Programmers Use Bad AI

https://nmn.gl/blog/ai-and-programmers
3•namanyayg•24m ago•1 comments

Project Mariner – Browser-Based AI Agent

https://deepmind.google/models/project-mariner/
6•jenthoven•25m ago•0 comments

SynthID Detector – a new portal to help identify AI-generated content

https://blog.google/technology/ai/google-synthid-ai-content-detector/
1•rob•25m ago•0 comments

Feels great to finish CRUD for my ERP app

https://github.com/oitcode/samarium
2•ignosnim•28m ago•1 comments

Intel explores sale of networking and edge unit

https://www.reuters.com/technology/intel-explores-sale-networking-edge-unit-sources-say-2025-05-20/
2•mfiguiere•29m ago•0 comments

Jupiter was formerly twice its current size and had much stronger magnetic field

https://phys.org/news/2025-05-jupiter-current-size-stronger-magnetic.html
2•amichail•30m ago•0 comments

A context-aware LLM agent built directly into Grafana Cloud

https://grafana.com/blog/2025/05/07/llm-grafana-assistant/
5•matryer•32m ago•0 comments

The Onion's Ben Collins Knows How to Save Media

https://www.vanityfair.com/hollywood/story/the-onions-ben-collins-knows-how-to-save-media
3•coloneltcb•32m ago•0 comments

Reachability analysis tool for Linux kernel CVEs

https://github.com/udibabaskydeck/ralk
1•ATechGuy•34m ago•0 comments