Building an AI cost-optimizer and AI Slop Prevention tool. Looking for feedback.
1•mdzakki•1h ago
Hey — Looking for feedback on my AI cost-optimization + “AI Slop Prevention” tool
I'm Zach, and I’ve been building AI features for a while now. Like many of you, I started noticing the same painful problems every time I shipped anything that used LLMs.
The problem (from a developer’s perspective)
AI bills get out of control fast. Even if you log usage, you still can't answer:
• “Which model is burning money?”
• “Why did this prompt suddenly cost 10× more?”
• “Is this output identical to something we already generated?”
• “Should this request even go to GPT-4, or would Groq/Claude suffice?”
• “Why did the LLM produce 3,000 tokens of slop when I asked for 200?”
• “How do I give my team access without accidentally letting them blow through my budget?”
And then there’s AI Slop — verbose responses, hallucinated filler text, and redundant reasoning chains that burn tokens without adding value.
Most teams have no defense against it.
I got tired of fighting this manually, so I started building something small… and it turned into a real product.
Introducing PricePrompter Cloud
A lightweight proxy + devtool that optimizes AI cost, reduces token waste, and prevents AI slop — without changing how you code.
You keep your existing OpenAI/Anthropic calls. We handle the optimization layer behind the scenes.
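Assuming an OpenAI-compatible endpoint (the URL and key scheme below are placeholders, not final), the drop-in would look roughly like this with the official openai Python SDK:

    from openai import OpenAI

    # Point the existing client at the proxy instead of api.openai.com.
    # Everything else in your code stays exactly as it is today.
    client = OpenAI(
        base_url="https://proxy.priceprompter.example/v1",  # placeholder URL
        api_key="YOUR_PRICEPROMPTER_KEY",                    # placeholder key
    )

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this ticket in 3 bullet points."}],
    )
    print(resp.choices[0].message.content)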
What it does
1⃣ Smart Routing (UCG Engine)
Send your AI request to PricePrompter → we route it to the cheapest model that satisfies your quality requirements.
• GPT-4 → Claude-Sonnet if equivalent
• GPT-3.5 style → Groq if faster/cheaper
• Or stay on your preferred model with cost warnings
Your code stays unchanged.
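A per-request routing hint could ride along as extra headers, something like this (header names are illustrative, and this reuses the client from the sketch above):

    # Hypothetical routing hints; defaults would come from your dashboard.
    resp = client.chat.completions.create(
        model="gpt-4o",  # the model your code asks for
        messages=[{"role": "user", "content": "Classify this support email."}],
        extra_headers={
            "X-PricePrompter-Quality": "gpt-4-equivalent",  # minimum acceptable quality
            "X-PricePrompter-Max-Cost": "0.002",            # per-call USD cap
        },
    )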
2⃣ FREE Semantic Caching
We automatically detect semantically similar requests and return cached results when it’s safe to do so.
You get real observability:
• Cache hits
• Cache misses
• Percentage matched
• Total savings
Caching will always remain free.
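Conceptually, it boils down to embedding each prompt and checking new prompts against recent ones. A toy sketch of the idea (with a stand-in bag-of-words embedding, not our actual pipeline):

    import math
    from collections import Counter

    def embed(text: str) -> Counter:
        # Stand-in embedding: bag of words. A real system would use an
        # embedding model here instead.
        return Counter(text.lower().split())

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    cache = []  # list of (embedding, response) pairs

    def lookup(prompt: str, threshold: float = 0.9):
        e = embed(prompt)
        best = max(cache, key=lambda item: cosine(e, item[0]), default=None)
        if best and cosine(e, best[0]) >= threshold:
            return best[1]  # cache hit: the paid API call is skipped entirely
        return None  # cache miss: call the model, then store() the result

    def store(prompt: str, response: str):
        cache.append((embed(prompt), response))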
3⃣ AI Slop Prevention Engine
This is one of the features I’m most excited about.
We detect:
• Overlong responses
• Repeated sections
• Chain-of-thought that isn’t needed
• Redundant reasoning
• Token inflation
• Hallucinated filler
And we trim, constrain, or guide the LLM to cut that waste before it hits your bill.
Think of it as:
“Linting for LLM calls.”
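The linting metaphor is fairly literal. A couple of the checks in toy form (illustrative thresholds; the production checks are more involved):

    def lint_response(text: str, max_tokens_requested: int) -> list[str]:
        """Toy slop checks; thresholds here are illustrative only."""
        warnings = []
        # Rough token estimate: ~4 characters per token for English text.
        approx_tokens = len(text) / 4
        if approx_tokens > 2 * max_tokens_requested:
            warnings.append(f"overlong: ~{int(approx_tokens)} tokens vs {max_tokens_requested} requested")
        # Repeated sections: identical non-trivial lines appearing more than once.
        lines = [l.strip() for l in text.splitlines() if len(l.strip()) > 30]
        if len(lines) != len(set(lines)):
            warnings.append("repeated sections detected")
        # Filler phrases that add tokens without adding information.
        for phrase in ("as an ai language model", "it is important to note"):
            if phrase in text.lower():
                warnings.append(f"filler phrase: {phrase!r}")
        return warnings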
4⃣ Developer Tools (Cursor-style SDK)
A VS Code extension + SDK that gives you:
• Cost per request (live)
• Alternative model suggestions
• Token breakdown
• “Why this request was expensive” explanation
• Model routing logs
• Usage analytics directly in your editor
No need to open dashboards unless you want deeper insights.
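Cost-per-request itself is simple math once you have token counts; the extension just surfaces it live. For example (prices below are placeholders, always check current provider pricing):

    # Placeholder per-1K-token prices in USD; examples only, not quotes.
    PRICES = {
        "gpt-4o": {"input": 0.005, "output": 0.015},
        "claude-sonnet": {"input": 0.003, "output": 0.015},
    }

    def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
        p = PRICES[model]
        return (prompt_tokens / 1000) * p["input"] + (completion_tokens / 1000) * p["output"]

    # Token counts come back on the response, e.g. resp.usage.prompt_tokens
    # and resp.usage.completion_tokens with the OpenAI SDK.
    print(f"${request_cost('gpt-4o', 1200, 300):.4f}")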
5⃣ Team & Enterprise Governance
Practical controls for growing teams:
• Spending limits
• Model-level permissions
• Approval for high-cost requests
• PII masking
• Key rotation
• Audit logs
• Team-level reporting
Nothing enterprise-y in a bad way — just the stuff dev teams actually need.
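In spirit, the spending limits and approvals are just a budget guard in front of every call, roughly like this (simplified; the real controls are per-key and per-team):

    class BudgetGuard:
        """Simplified spend limit + approval gate; illustrative only."""

        def __init__(self, monthly_limit_usd: float, approval_threshold_usd: float):
            self.monthly_limit = monthly_limit_usd
            self.approval_threshold = approval_threshold_usd
            self.spent = 0.0

        def check(self, estimated_cost: float) -> str:
            if self.spent + estimated_cost > self.monthly_limit:
                return "blocked: monthly limit reached"
            if estimated_cost > self.approval_threshold:
                return "pending: high-cost request needs approval"
            return "allowed"

        def record(self, actual_cost: float) -> None:
            self.spent += actual_cost

    guard = BudgetGuard(monthly_limit_usd=500, approval_threshold_usd=0.50)
    print(guard.check(estimated_cost=0.02))  # "allowed"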
Who this is for
• Developers building LLM features
• SaaS teams using expensive models
• Startups struggling with unpredictable OpenAI bills
• Agencies running multi-client workloads
• Anyone experimenting with multi-model routing
• Anyone who wants visibility into token usage
• Anyone tired of “AI slop” blowing up their costs
What I’m looking for:
I’d love real feedback from developers:
• Would you trust a proxy that optimizes your LLM cost?
• Is AI slop prevention actually useful in your workflow?
• Is free semantic caching valuable?
• What would make this a must-have devtool?
• What pricing model makes sense for you?
• Any dealbreakers or concerns?
Still shaping the MVP — so your input directly influences what gets built next.
Happy to answer questions or share a preview.
Thanks!
— Zach