Like many AI-native applications, Fabricate supports a variety of LLMs (Sonnet, Haiku, Opus, etc.) for various agents and tasks. Users pay for usage by the token. We’d love to use Stripe Billing Meters to charge users by the token — I trust Stripe’s math more than my own when it comes to floating point arithmetic with very large numbers.
But Stripe caps subscriptions at 20 items. Each meter requires its own price, which in turn consumes one of those 20 item slots.
This becomes a problem fast with AI. Multiply 4-5 models by 5 token types (input, output, cached input, and ephemeral cache tiers) and you're already at 20-25 meters; adding even one more model exceeds the limit.
The workaround we settled on was to move the pricing logic entirely into Fabricate and report abstract "billing units" to Stripe instead of raw tokens.
The setup is pretty simple:
1. We created a single meter in Stripe with a fixed price of $0.00000001 per unit.
2. Internally, we assigned an integer multiplier to every model/token combination (e.g., Sonnet input might be 350 units per token, while Haiku cached input is 10).
3. At reporting time, we do the integer math: (tokens * multiplier) for each category, sum them up, and send that one aggregate number to Stripe.
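The steps above can be sketched in a few lines of Python. The multiplier values besides Sonnet input (350) and Haiku cached input (10) are invented for illustration, as is the `UNIT_MULTIPLIERS` table name; the real rates and reporting plumbing will differ.

```python
# Hypothetical multiplier table: integer "billing units" per token for each
# (model, token_type) pair. Only the Sonnet input (350) and Haiku cached
# input (10) figures come from the text; the rest are made up.
UNIT_MULTIPLIERS = {
    ("sonnet", "input"): 350,
    ("sonnet", "output"): 1750,
    ("haiku", "input"): 25,
    ("haiku", "cached_input"): 10,
}

def billing_units(usage: dict[tuple[str, str], int]) -> int:
    """Aggregate raw token counts into one integer unit total.

    `usage` maps (model, token_type) -> token count. Everything stays in
    integer math, so no floating-point rounding happens on our side before
    the number reaches Stripe.
    """
    return sum(tokens * UNIT_MULTIPLIERS[key] for key, tokens in usage.items())

# The aggregate then goes to the single Stripe meter as one event, roughly
# (assuming the standard Billing Meter Events API):
#
#   stripe.billing.MeterEvent.create(
#       event_name="billing_units",
#       payload={"stripe_customer_id": customer_id,
#                "value": str(billing_units(usage))},
#   )
```

For example, 1,000 Sonnet input tokens plus 5,000 Haiku cached input tokens come out to 1,000 × 350 + 5,000 × 10 = 400,000 units, which at $0.00000001 per unit bills as $0.004.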
This effectively decouples our model list from Stripe’s API constraints. We can add as many models or token tiers as we want without ever touching Stripe configuration again.
The obvious downside is that the Stripe dashboard just shows a massive number of "units" rather than a breakdown of what the customer actually used. We decided we didn't care. We have our own internal telemetry for usage analytics; Stripe’s only job is to multiply a number by a price and generate a valid invoice.
If you’re setting up usage-based billing with more than a few dimensions, I’d recommend abstracting your units from day one. It’s much easier than trying to migrate your billing architecture once you’ve already hit that 20-item ceiling. And of course, if you're Stripe (hi, Stripe!), I'd really recommend adding native support for this very popular use case.