I built this autonomous pipeline to see if agentic orchestration could replicate a high-quality editorial desk with zero manual overhead. It's a tech news stream that removes the "noise" (deals, opinions, fluff) using a multi-model agentic approach.
The Agentic Pipeline (runs every 2 hours):
I custom-coded the orchestration to swap LLMs based on their specific strengths (a routing sketch follows the list):
1. Discovery: Scrapes raw feeds, removes duplicates, and checks against the published cache.
2. Classification (default: Gemini): Filters out non-tech news and "opinion" pieces. Gemini's large context window makes it great for high-volume filtering.
3. Prioritization: Selects the top 5 most impactful stories from the filtered list.
4. Authoring (default: GPT-4o): Drafts the report from the raw facts provided by the Discovery agent.
5. Proofreading (default: Sonnet 3.5): Handles the final edit, ensuring a human-like tone, and fact-checks against the sources.
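For concreteness, here's a minimal sketch of the per-stage routing. It assumes the official openai, anthropic, and google-generativeai SDKs with API keys in the environment; the MODEL_ROUTES table and run_stage helper are illustrative names, not my exact production code:

    # Sketch of per-stage LLM routing (simplified, not the production code).
    # Assumes: pip install openai anthropic google-generativeai
    # Keys read from OPENAI_API_KEY / ANTHROPIC_API_KEY / GOOGLE_API_KEY.
    import os

    import anthropic
    import google.generativeai as genai
    from openai import OpenAI

    # Each stage defaults to the model that plays to its strengths.
    MODEL_ROUTES = {
        "classification": ("gemini", "gemini-1.5-flash"),  # cheap, big context
        "authoring": ("openai", "gpt-4o"),                 # strongest drafting
        "proofreading": ("anthropic", "claude-3-5-sonnet-20241022"),  # tone
    }

    def run_stage(stage: str, prompt: str) -> str:
        """Dispatch a prompt to the provider configured for this stage."""
        provider, model = MODEL_ROUTES[stage]
        if provider == "openai":
            resp = OpenAI().chat.completions.create(
                model=model, messages=[{"role": "user", "content": prompt}]
            )
            return resp.choices[0].message.content
        if provider == "anthropic":
            resp = anthropic.Anthropic().messages.create(
                model=model, max_tokens=2048,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.content[0].text
        if provider == "gemini":
            genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
            return genai.GenerativeModel(model).generate_content(prompt).text
        raise ValueError(f"Unknown provider: {provider}")

Swapping a stage's model is then a one-line change to MODEL_ROUTES.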
The Lean Tech Stack:
- Backend: Custom Python orchestration.
- Publishing: WordPress API (Website) + X API (Twitter) + Zapier (LinkedIn).
- Stateless: I bypass a local database entirely, using the WordPress REST API as my primary content store.
- Optimized: A "Non-News Cache" prevents re-processing URLs already identified as noise, saving on token costs (see the sketch after this list).
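To make the stateless design concrete, here's a rough sketch of the two dedup checks. The site URL, cache filename, and helper names are placeholders, and the Non-News Cache is shown as a flat file purely for illustration; it assumes only the standard WordPress REST API and the requests library:

    # Sketch of the stateless dedup checks (illustrative names only).
    import json
    import pathlib

    import requests

    WP_BASE = "https://example.com/wp-json/wp/v2"     # placeholder site
    CACHE_FILE = pathlib.Path("non_news_cache.json")  # hypothetical filename

    def already_published(source_url: str) -> bool:
        """Ask WordPress itself whether a post already cites this source.

        No local database: the published posts ARE the published cache.
        """
        resp = requests.get(
            f"{WP_BASE}/posts",
            params={"search": source_url, "per_page": 1},
            timeout=10,
        )
        resp.raise_for_status()
        return len(resp.json()) > 0

    def load_non_news_cache() -> set[str]:
        """URLs already classified as noise; never re-send these to the LLM."""
        if CACHE_FILE.exists():
            return set(json.loads(CACHE_FILE.read_text()))
        return set()

    def mark_non_news(url: str, cache: set[str]) -> None:
        cache.add(url)
        CACHE_FILE.write_text(json.dumps(sorted(cache)))

The key idea: a published post on WordPress doubles as the "already covered" record, so there is no separate database to keep in sync.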
Every post starts with a disclaimer and cites the original sources. Currently, it's 100% automated and has grown to 50 organic followers.
I'd love to hear feedback on the "agentic" logic or how I can better handle potential classification hallucinations!
vivzkestrel•1h ago
- define "impactful"? how do you know what is impactful and what is not? where is the threshold?