I’ve been working on a system for people who want to understand what actually happened in a news story—without trusting a single outlet or a single summary.
Instead of producing another “AI summary,” the goal is to make the entire chain of reasoning transparent:
1. Pull multiple articles for the same event (left, center, right, wires, gov).
2. Extract atomic claims from all of them.
3. Retrieve the relevant evidence passages.
4. Run an MNLI model to classify each claim as Supported / Contradicted / Inconclusive.
5. Show a full receipt trail for every claim (source, quote, timestamp).
The output is less like “news” and more like a structured evidence map of the story.
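To make that "evidence map" concrete, here's roughly the per-claim record everything hangs off of (a minimal sketch; the field names below are illustrative, not the exact schema):

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Evidence:
        source: str        # outlet name or domain
        quote: str         # verbatim passage the verdict rests on
        url: str
        published_at: str  # ISO-8601 timestamp

    @dataclass
    class ClaimVerdict:
        claim: str               # atomic claim extracted from an article
        label: str               # "supported" | "contradicted" | "inconclusive"
        confidence: float        # aggregated MNLI probability
        receipts: List[Evidence] = field(default_factory=list)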
Links (no signup):
• News pages: https://neutralnewsai.com
• Analyzer (paste any URL): https://neutralnewsai.com/analyzer
• Methodology: https://neutralnewsai.com/methodology
Instead of focusing on “neutral summaries,” I’ve shifted to emphasizing transparency + multi-source evidence. The summary is just the last layer; the real value is in surfacing contradictions, missing context, and uncertainty.
I’m also working on:
• A browser extension that runs the analysis on whatever article you’re reading.
• A white-label API that outputs claims + evidence + MNLI verdicts for researchers / journalists.
How it works (technical overview)
Crawling / dedup
Scheduled scrapers + curated source lists. Clustering based on title/body similarity.
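A minimal sketch of the dedup step, assuming plain TF-IDF cosine similarity with a greedy threshold (the production features and threshold differ):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def cluster_articles(articles, threshold=0.65):
        """articles: dicts with 'title' and 'body'; returns a cluster id per article."""
        texts = [a["title"] + " " + a["body"] for a in articles]
        sims = cosine_similarity(TfidfVectorizer(stop_words="english").fit_transform(texts))
        labels = [-1] * len(articles)
        next_id = 0
        for i in range(len(articles)):
            if labels[i] != -1:
                continue
            labels[i] = next_id
            for j in range(i + 1, len(articles)):
                if labels[j] == -1 and sims[i, j] >= threshold:
                    labels[j] = next_id
            next_id += 1
        return labels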
Claim extraction
Sentence segmentation → classifier that detects check-worthy clauses (entities, counts, events, quotes, temporal markers).
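Roughly this shape, except the check-worthiness decision is a trained classifier rather than the crude heuristic standing in below (entities, numbers, quotes):

    import spacy

    nlp = spacy.load("en_core_web_sm")
    CHECKWORTHY_ENTS = {"PERSON", "ORG", "GPE", "DATE", "CARDINAL", "MONEY", "PERCENT"}

    def extract_candidate_claims(article_text):
        doc = nlp(article_text)
        claims = []
        for sent in doc.sents:
            has_entity = any(ent.label_ in CHECKWORTHY_ENTS for ent in sent.ents)
            has_number = any(tok.like_num for tok in sent)
            has_quote = '"' in sent.text or "\u201c" in sent.text
            if has_entity or has_number or has_quote:
                claims.append(sent.text.strip())
        return claims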
Evidence retrieval
Sliding window over the article text + heuristics for merging overlapping snippets.
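Sketch of the window/merge mechanics (window and stride values here are placeholders, not the tuned ones):

    def sliding_windows(sentences, window=4, stride=2):
        """Yield (start_index, passage_text) windows over a list of sentences."""
        for start in range(0, max(len(sentences) - window + 1, 1), stride):
            yield start, " ".join(sentences[start:start + window])

    def merge_overlapping(spans):
        """spans: (start, end) sentence ranges judged relevant; merge any that overlap."""
        merged = []
        for start, end in sorted(spans):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return merged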
Fact-checking
DeBERTa-based MNLI model over (claim, passage) pairs. I'm currently experimenting with better aggregation across multiple passages.
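The scoring step looks roughly like the sketch below; the checkpoint and thresholds are illustrative (a sufficiently confident contradiction wins, then the strongest entailment, otherwise inconclusive), and this aggregation rule is exactly the part still being iterated on:

    from transformers import pipeline

    nli = pipeline("text-classification", model="microsoft/deberta-large-mnli", top_k=None)

    def verdict(claim, passages, thr=0.7):
        scores = []
        for passage in passages:
            out = nli({"text": passage, "text_pair": claim})  # premise = passage, hypothesis = claim
            scores.append({d["label"].lower(): d["score"] for d in out})
        if max(s.get("contradiction", 0.0) for s in scores) >= thr:
            return "contradicted"
        if max(s.get("entailment", 0.0) for s in scores) >= thr:
            return "supported"
        return "inconclusive"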
Signals
Bias / sentiment / subjectivity / readability. Transformer classifiers + lightweight feature set.
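The "lightweight feature set" half is simple enough to sketch (textstat and TextBlob below are stand-ins for illustration; the bias classifiers are separate transformer models):

    import textstat
    from textblob import TextBlob

    def lightweight_signals(text):
        blob = TextBlob(text)
        return {
            "readability_flesch": textstat.flesch_reading_ease(text),
            "sentiment_polarity": blob.sentiment.polarity,    # -1 (negative) .. 1 (positive)
            "subjectivity": blob.sentiment.subjectivity,      # 0 (objective) .. 1 (subjective)
        }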
Stack
Backend in Python + PostgreSQL; front-end in Angular. Server-rendered article pages for SEO + speed.
Where I’m unsure / what I’d love feedback on
1. MNLI limits
At what point should I move from vanilla MNLI to something more retrieval-augmented or fine-tuned for journalism-style claims?
2. Claim extraction reliability
Is it worth moving toward a more formal IE pipeline (NER + relation extraction + event frames), or does that add more complexity than it solves?
3. Uncertainty communication
How would you present “inconclusive” or low-confidence cases to non-technical readers without misleading them?
4. Evaluation methodology
What would a convincing benchmark look like? I have offline accuracy for several classifiers, but I haven’t found good public datasets specifically for multi-source contradictory claims.
If you see conceptual flaws or think this approach is risky, I’m genuinely open to hearing strong arguments against it.
Thanks for reading, Marcell