Building an AI cost-optimizer and AI Slop Prevention tool. Looking for feedback.
1•mdzakki•1h ago
Hey — Looking for feedback on my AI cost-optimization + “AI Slop Prevention” tool
I'm Zach, and I’ve been building AI features for a while now. Like many of you, I started noticing the same painful problems every time I shipped anything that used LLMs.
The problem (from a developer’s perspective)
AI bills get out of control fast. Even if you log usage, you still can't answer:
• “Which model is burning money?”
• “Why did this prompt suddenly cost 10× more?”
• “Is this output identical to something we already generated?”
• “Should this request even go to GPT-4, or would Groq/Claude suffice?”
• “Why did the LLM produce 3,000 tokens of slop when I asked for 200?”
• “How do I give my team access without accidentally letting them blow through my budget?”
And then there’s AI Slop — verbose responses, hallucinated filler text, and redundant reasoning chains that burn tokens without adding value.
Most teams have no defense against it.
I got tired of fighting this manually, so I started building something small… and it turned into a real product.
Introducing PricePrompter Cloud
A lightweight proxy + devtool that optimizes AI cost, reduces token waste, and prevents AI slop — without changing how you code.
You keep your existing OpenAI/Anthropic calls. We handle the optimization layer behind the scenes.
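Assuming an OpenAI-compatible endpoint (the URL and key scheme below are placeholders, not final), the drop-in would look roughly like this with the official openai Python SDK:

    from openai import OpenAI

    # Point the existing client at the proxy instead of api.openai.com.
    # Everything else in your code stays exactly as it is today.
    client = OpenAI(
        base_url="https://proxy.priceprompter.example/v1",  # placeholder URL
        api_key="YOUR_PRICEPROMPTER_KEY",                    # placeholder key
    )

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this ticket in 3 bullet points."}],
    )
    print(resp.choices[0].message.content)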
What it does
1⃣ Smart Routing (UCG Engine)
Send your AI request to PricePrompter → we route it to the cheapest model that satisfies your quality requirements.
• GPT-4 → Claude-Sonnet if equivalent
• GPT-3.5 style → Groq if faster/cheaper
• Or stay on your preferred model with cost warnings
Your code stays unchanged.
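A per-request routing hint could ride along as extra headers, something like this (header names are illustrative, and this reuses the client from the sketch above):

    # Hypothetical routing hints; defaults would come from your dashboard.
    resp = client.chat.completions.create(
        model="gpt-4o",  # the model your code asks for
        messages=[{"role": "user", "content": "Classify this support email."}],
        extra_headers={
            "X-PricePrompter-Quality": "gpt-4-equivalent",  # minimum acceptable quality
            "X-PricePrompter-Max-Cost": "0.002",            # per-call USD cap
        },
    )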
2⃣ FREE Semantic Caching
We automatically detect semantically similar requests and return cached results when it’s safe to do so.
You get real observability:
• Cache hits
• Cache misses
• Percentage matched
• Total savings
Caching will always remain free.
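Conceptually, it boils down to embedding each prompt and checking new prompts against recent ones. A toy sketch of the idea (with a stand-in bag-of-words embedding, not our actual pipeline):

    import math
    from collections import Counter

    def embed(text: str) -> Counter:
        # Stand-in embedding: bag of words. A real system would use an
        # embedding model here instead.
        return Counter(text.lower().split())

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    cache = []  # list of (embedding, response) pairs

    def lookup(prompt: str, threshold: float = 0.9):
        e = embed(prompt)
        best = max(cache, key=lambda item: cosine(e, item[0]), default=None)
        if best and cosine(e, best[0]) >= threshold:
            return best[1]  # cache hit: the paid API call is skipped entirely
        return None  # cache miss: call the model, then store() the result

    def store(prompt: str, response: str):
        cache.append((embed(prompt), response))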
3⃣ AI Slop Prevention Engine
This is one of the features I’m most excited about.
We detect:
• Overlong responses
• Repeated sections
• Chain-of-thought that isn’t needed
• Redundant reasoning
• Token inflation
• Hallucinated filler
And we trim, constrain, or guide the LLM to cut that waste before it hits your bill.
Think of it as:
“Linting for LLM calls.”
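The linting metaphor is fairly literal. A couple of the checks in toy form (illustrative thresholds; the production checks are more involved):

    def lint_response(text: str, max_tokens_requested: int) -> list[str]:
        """Toy slop checks; thresholds here are illustrative only."""
        warnings = []
        # Rough token estimate: ~4 characters per token for English text.
        approx_tokens = len(text) / 4
        if approx_tokens > 2 * max_tokens_requested:
            warnings.append(f"overlong: ~{int(approx_tokens)} tokens vs {max_tokens_requested} requested")
        # Repeated sections: identical non-trivial lines appearing more than once.
        lines = [l.strip() for l in text.splitlines() if len(l.strip()) > 30]
        if len(lines) != len(set(lines)):
            warnings.append("repeated sections detected")
        # Filler phrases that add tokens without adding information.
        for phrase in ("as an ai language model", "it is important to note"):
            if phrase in text.lower():
                warnings.append(f"filler phrase: {phrase!r}")
        return warnings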
4⃣ Developer Tools (Cursor-style SDK)
A VS Code extension + SDK that gives you:
• Cost per request (live)
• Alternative model suggestions
• Token breakdown
• “Why this request was expensive” explanation
• Model routing logs
• Usage analytics directly in your editor
No need to open dashboards unless you want deeper insights.
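Cost-per-request itself is simple math once you have token counts; the extension just surfaces it live. For example (prices below are placeholders, always check current provider pricing):

    # Placeholder per-1K-token prices in USD; examples only, not quotes.
    PRICES = {
        "gpt-4o": {"input": 0.005, "output": 0.015},
        "claude-sonnet": {"input": 0.003, "output": 0.015},
    }

    def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
        p = PRICES[model]
        return (prompt_tokens / 1000) * p["input"] + (completion_tokens / 1000) * p["output"]

    # Token counts come back on the response, e.g. resp.usage.prompt_tokens
    # and resp.usage.completion_tokens with the OpenAI SDK.
    print(f"${request_cost('gpt-4o', 1200, 300):.4f}")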
5⃣ Team & Enterprise Governance
Practical controls for growing teams:
• Spending limits
• Model-level permissions
• Approval for high-cost requests
• PII masking
• Key rotation
• Audit logs
• Team-level reporting
Nothing enterprise-y in a bad way — just the stuff dev teams actually need.
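In spirit, the spending limits and approvals are just a budget guard in front of every call, roughly like this (simplified; the real controls are per-key and per-team):

    class BudgetGuard:
        """Simplified spend limit + approval gate; illustrative only."""

        def __init__(self, monthly_limit_usd: float, approval_threshold_usd: float):
            self.monthly_limit = monthly_limit_usd
            self.approval_threshold = approval_threshold_usd
            self.spent = 0.0

        def check(self, estimated_cost: float) -> str:
            if self.spent + estimated_cost > self.monthly_limit:
                return "blocked: monthly limit reached"
            if estimated_cost > self.approval_threshold:
                return "pending: high-cost request needs approval"
            return "allowed"

        def record(self, actual_cost: float) -> None:
            self.spent += actual_cost

    guard = BudgetGuard(monthly_limit_usd=500, approval_threshold_usd=0.50)
    print(guard.check(estimated_cost=0.02))  # "allowed"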
Who this is for
• Developers building LLM features
• SaaS teams using expensive models
• Startups struggling with unpredictable OpenAI bills
• Agencies running multi-client workloads
• Anyone experimenting with multi-model routing
• Anyone who wants visibility into token usage
• Anyone tired of “AI slop” blowing up their costs
What I’m looking for:
I’d love real feedback from developers:
• Would you trust a proxy that optimizes your LLM cost?
• Is AI slop prevention actually useful in your workflow?
• Is free semantic caching valuable?
• What would make this a must-have devtool?
• What pricing model makes sense for you?
• Any dealbreakers or concerns?
Still shaping the MVP — so your input directly influences what gets built next.
Happy to answer questions or share a preview.
Thanks!
— Zach