Hi HN,
We're excited to share GroundCite, an open-source, Gemini-based multi-agent library designed to solve one of the biggest headaches in using powerful LLMs for reliable, grounded research: invalid, irrelevant, or broken citations.
The Problem
While Google Gemini's grounding feature is powerful, relying on it for high-stakes, trustworthy content often results in sources that are:
1. Irrelevant: The source doesn't actually support the claim.
2. Unreliable: The model cites a site you know to be low-quality or out of date.
3. Broken/Invalid: The URL is dead or the cited snippet is missing.
This fundamentally undermines the research integrity of LLM outputs.
Our Solution: GroundCite
GroundCite acts as a smart layer around the core Gemini model to enforce accuracy and provide developers with granular control over the grounding process.
It uses a multi-agent architecture to validate and filter sources before the final answer is generated, turning Gemini into a much more reliable research assistant.
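To give a sense of what the validation step involves, here is an illustrative sketch (not GroundCite's actual code): a validation agent can confirm that each cited URL still resolves and that the page text plausibly supports the claim before the citation makes it into the final answer.

    # Illustrative sketch only, not GroundCite's actual internals.
    # A citation-validation step can confirm the cited URL resolves and that
    # the page text plausibly supports the claim before keeping the citation.
    import requests

    def citation_looks_valid(claim: str, url: str, timeout: float = 10.0) -> bool:
        try:
            resp = requests.get(url, timeout=timeout)
        except requests.RequestException:
            return False                      # dead or unreachable URL
        if resp.status_code != 200:
            return False                      # broken link
        page = resp.text.lower()
        # Crude relevance check: most of the claim's longer terms should appear
        # on the cited page. A real validator would use an LLM judgment instead.
        terms = [t for t in claim.lower().split() if len(t) > 4]
        hits = sum(1 for t in terms if t in page)
        return bool(terms) and hits >= len(terms) / 2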
Key Features:
• Source Filtering & Control: You can explicitly define a list of trusted sites for the model to use, or, critically, exclude unreliable sources (like forums or outdated wikis) from the grounding pool (see the usage sketch after this list).
• Active Citation Validation: The library validates the generated source URLs and confirms that the content within them actually supports the model's claim.
• Enhanced Reliability: By enforcing controlled grounding and validated citations, GroundCite makes the output significantly more trustworthy and accurate.
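To make the features above concrete, here is a hypothetical usage sketch. The names (GroundCiteAgent, include_sites, exclude_sites, validate_citations, research) are placeholders for illustration, not the library's exact API; the repo and docs below have the real interface.

    # Hypothetical usage sketch: GroundCiteAgent, include_sites, exclude_sites,
    # validate_citations, and research are placeholder names, not the real API.
    from groundcite import GroundCiteAgent  # placeholder import

    agent = GroundCiteAgent(
        model="gemini-2.0-flash",
        include_sites=["who.int", "nature.com"],   # only ground against trusted sites
        exclude_sites=["example-forum.com"],       # drop low-quality sources
        validate_citations=True,                   # check URLs and snippet support
    )

    result = agent.research("What are the current WHO guidelines on vitamin D intake?")
    print(result.text)
    for citation in result.citations:
        print(citation.url, citation.verified)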
We built this out of necessity and decided to open-source it for anyone else struggling to get reliable, well-cited research out of LLMs.
Links
• Playground / Demo: See the difference controlled grounding makes and try it before integrating:
https://groundcite.cennest.com/
• GitHub Repository: We welcome stars, issues, and contributions!
https://github.com/cennest/ground-cite
• Technical Articles / Full Story:
https://www.cennest.com/category/groundcite/
We are looking for feedback on the multi-agent architecture, potential use cases, and how we can best extend this to support other LLMs and grounding services.
Thanks!