The workflow: Scrape → Clean content → Upload to Gemini → Get permanent queryable knowledge base with automatic citations.
Technical approach: - Intelligent scraper selection from 5 Apify-native scrapers - Automatic content cleaning (strips nav/ads/fluff) - Uploads to Gemini File Search (persistent storage) - Per-page PPE pricing ($0.02 start, $0.0015/page)
Potential use cases: - Turn documentation into AI chatbots - Make company wikis naturally searchable - Build RAG apps without managing vector databases
Basically streamline your Gemini file search RAG ingestion with an Apify scraper run AIO.
You can also employ the actor programmatically with agents using Apify's actor mcp.
Built this over a weekend for the Apify 1M Actor Challenge. It's my first Apify actor, so curious to hear if the pricing makes sense.
Note* There is a banned websites to scrape list filter due to the constraints of the challenge. This will be lifted after the challenge ends (Jan 31, 2026).