It runs your prompts or dataset (classification tasks only for now) across multiple models, compares performance against cost, and recommends the best-value model so you’re not wasting tokens or budget.
Would love feedback and ideas for features you’d want before using it in your own workflow!
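
To make the "performance vs. cost" idea concrete, here is a rough sketch of one way a best-value pick could work. This is not the tool's actual logic: the `ModelResult` type, the `recommend` function, the 2-point accuracy tolerance, and the model names and numbers are all made up for illustration. It assumes "best value" means the cheapest model whose accuracy is close to the top score.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    """Outcome of one model on a classification set (hypothetical shape)."""
    name: str        # placeholder model label, not a real model
    correct: int     # examples classified correctly
    total: int       # examples evaluated
    cost_usd: float  # total spend for the run

    @property
    def accuracy(self) -> float:
        return self.correct / self.total if self.total else 0.0


def recommend(results, tolerance=0.02):
    """Return the cheapest model within `tolerance` of the best accuracy.

    The tolerance rule is an assumption; other definitions of "value"
    (for example accuracy per dollar) are equally plausible.
    """
    if not results:
        raise ValueError("no results to compare")
    best_acc = max(r.accuracy for r in results)
    candidates = [r for r in results if r.accuracy >= best_acc - tolerance]
    # Cheapest first; ties on cost go to the more accurate model.
    return min(candidates, key=lambda r: (r.cost_usd, -r.accuracy))


# Invented numbers, purely to exercise the function.
results = [
    ModelResult("model-large", correct=94, total=100, cost_usd=1.20),
    ModelResult("model-medium", correct=93, total=100, cost_usd=0.30),
    ModelResult("model-small", correct=85, total=100, cost_usd=0.05),
]

print(recommend(results).name)
```

With these invented numbers the mid-sized model wins: it is one point behind the most accurate model at a quarter of the cost. Widening `tolerance` trades accuracy for savings, so it is the knob a user would most likely want exposed.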
mihir_ahuja•5mo ago