MiRAGE: Open-source framework for multimodal RAG evaluation

1•mmhetric•1h ago
Code: https://github.com/ChandanKSahu/MiRAGE

Hi HN, we are the authors of MiRAGE.

We built this because standard RAG benchmarks (like Natural Questions) rely on text-only Wikipedia-like data, which doesn't reflect the reality of enterprise RAG. In the real world, "truth" is often locked in a chart, a complex table, or a diagram deep inside a PDF.

MiRAGE is an open-source framework that uses a swarm of specialized agents to reverse-engineer evaluation datasets from your own documents.

How it works:

1. Ingest: It uses vision models to describe charts/tables and "semantically chunk" the PDF.

2. Generate: An agent swarm (Generator, Retriever, Persona-Injector) creates multi-hop questions.

3. Verify: An adversarial "Verifier Agent" fact-checks the answers against the source to prevent hallucinated ground truth.
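The Generate and Verify steps above can be illustrated with a toy filter. This is a minimal sketch, not MiRAGE's actual API: `QAPair`, `toy_verifier`, `filter_verified`, and the `min_overlap` threshold are all hypothetical names, and a simple token-overlap check stands in for what would really be an LLM-based verifier agent.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str
    source_chunk: str  # text (or a vision-model description) the answer should come from


def toy_verifier(pair: QAPair, min_overlap: float = 0.8) -> bool:
    """Hypothetical stand-in for a verifier agent.

    Accepts a pair only if most answer tokens also appear in the source chunk.
    A real verifier would ask a model to fact-check the answer instead.
    """
    answer_tokens = set(pair.answer.lower().split())
    source_tokens = set(pair.source_chunk.lower().split())
    if not answer_tokens:
        return False  # an empty answer can never be grounded
    overlap = len(answer_tokens & source_tokens) / len(answer_tokens)
    return overlap >= min_overlap


def filter_verified(pairs, verifier=toy_verifier):
    """Keep only the generated pairs that the verifier accepts."""
    return [p for p in pairs if verifier(p)]


# Made-up example data: one grounded answer, one hallucinated answer.
chunk = "figure 3 shows revenue grew 12% in q2 while costs fell 4%"
candidates = [
    QAPair("How much did revenue grow in Q2?", "revenue grew 12% in q2", chunk),
    QAPair("How much did revenue grow in Q2?", "revenue doubled in q3", chunk),
]
verified = filter_verified(candidates)
print(len(verified), "of", len(candidates), "pairs kept")
```

The point of the sketch is the shape of the loop: generated pairs only enter the evaluation set after an independent check against the source, so an unsupported answer is dropped rather than becoming ground truth.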

Key Finding: In our ablation studies, removing the adversarial verifier dropped the faithfulness of the generated dataset from 97% to 74%. Synthetic data needs self-verification.

Resources:

- Paper (arXiv): https://arxiv.org/abs/2601.15487
- Install: pip install mirage-benchmark
- Demo: (See the terminal video in the repo)

We’d like your feedback, especially on the "Visual Grounding" challenge, which is still the hardest part of multimodal RAG. Happy to answer any questions!

The mathematics of compression in database systems

https://www.bitsxpages.com/p/the-mathematics-of-compression-in
2•agavra•2m ago•0 comments

The Datacenter as a Computer (2013)

https://research.google/pubs/the-datacenter-as-a-computer-an-introduction-to-the-design-of-wareho...
1•tosh•3m ago•0 comments

Show HN: Open-Source SDK for AI Knowledge Work

https://github.com/ClioAI/kw-sdk
1•ankit219•4m ago•0 comments

Study: LLMs found to echo false claims in medical notes and social media

https://www.mountsinai.org/about/newsroom/2026/can-medical-ai-lie-large-study-maps-how-llms-handl...
1•giuliomagnifico•4m ago•0 comments

Hyundai Motor to supply 50k autonomous vehicles to Waymo by 2028

https://autonews.gasgoo.com/articles/icv/behind-a-potential-25b-deal-hyundai-and-waymo-tackle-sca...
1•ra7•5m ago•0 comments

Show HN: Deploy Multiple OpenClaw Assistants Easily

https://www.moltbot-online.com/
1•DSpider•5m ago•0 comments

Vibe Coding

https://tosbourn.com/vibe-coding/
2•tosbourn•5m ago•0 comments

AI workloads challenge the cattle model

https://varoa.net/2026/02/07/ai-workloads-challenge-the-cattle-model.html
1•srvaroa•5m ago•0 comments

The Singularity Will Occur on a Tuesday

https://campedersen.com/singularity
1•ecto•5m ago•0 comments

A Stanford Experiment to Pair 5,000 Singles Has Taken over Campus

https://www.wsj.com/lifestyle/relationships/stanford-students-experiment-dating-date-drop-92a4aea8
1•impish9208•5m ago•1 comments

Ask HN: What's your opinion on the Swisscows search engine?

1•palata•6m ago•0 comments

NYC subway stations by population in catchment area

https://www.anita.garden/assets/nycvoronoi.png
2•frenchman_in_ny•6m ago•0 comments

Free LLM API Resources – A List of Free LLM Inference APIs

https://github.com/cheahjs/free-llm-api-resources
1•willmarquis•6m ago•0 comments

Lokutor Orchestrator: A Go library for full-duplex, interruptible voice AI

https://github.com/lokutor-ai/lokutor-orchestrator
1•dani-lokutor•8m ago•1 comments

Show HN: HN Companion – web app that enhances the experience of reading HN

https://hncompanion.com
2•georgeck•8m ago•1 comments

"Hate brings views": Confessions of a London fake news TikToker

https://www.londoncentric.media/p/london-tiktok-fake-news-creator-hate-immigrants
2•pbshgthm•9m ago•0 comments

Show HN: I made an open source dashboard to track your Stripe and RevenueCat rev

https://ohmydashboard.com
1•guivr•10m ago•1 comments

Why some Canadians are betting big on 3D printed housing in Canada

https://www.cbc.ca/news/canada/3d-printing-houses-canada-9.7081720
3•cf100clunk•10m ago•0 comments

Show HN: ClearDemand – Cross-case search and drafting for injury firms

https://cleardemand.io/
2•Dave_stridefuel•12m ago•2 comments

Copilot SDK in Technical Preview

https://github.blog/changelog/2026-01-14-copilot-sdk-in-technical-preview/
1•tosh•12m ago•0 comments

Show HN: I made paperboat.website, a platform for friends and creativity

https://paperboat.website/home/
4•yethiel•12m ago•3 comments

Daylight Mirror: Mac on paperlike screen, 30fps <10ms, Opus 4.6 in <8 hours

https://twitter.com/_welf/status/2020608341035077834
1•welfvonhoeren•13m ago•0 comments

Show HN: A real-time collaborative word puzzle inspired by NYT Spelling Bee

https://wannabeewith.me/
2•catdeleon•14m ago•0 comments

Semaglutide improves knee osteoarthritis independent of weight loss

https://www.cell.com/cell-metabolism/abstract/S1550-4131(26)00008-2
2•randycupertino•15m ago•1 comments

Claude Feature Request: Support Agents.md

https://github.com/anthropics/claude-code/issues/6235
1•buchanae•16m ago•0 comments

AI Flattened the Engineering Ladder

https://ossama.is/blog/ladder
1•ossa-ma•16m ago•0 comments

Map showing most notable people per region

https://tjukanovt.github.io/notable-people
2•nilsherzig•16m ago•2 comments

America Isn't Ready for What AI Will Do to Jobs

https://www.theatlantic.com/magazine/2026/03/ai-economy-labor-market-transformation/685731/
2•fortran77•16m ago•2 comments

Show HN: Browse neologisms for the feelings and experiences English can't name

https://words.hails.info
1•djrhails•17m ago•1 comments

ICE Is Expanding Across the US at Breakneck Speed. Here's Where It's Going Next

https://www.wired.com/story/ice-expansion-across-us-at-heres-where-its-going-next/
3•coloneltcb•20m ago•2 comments