Author here. The core idea is pretty simple: train linear probes on the model's internal state, before it generates anything, to predict whether it will succeed. Then use those predictions to route queries: send easy ones to cheap inference, hard ones to expensive reasoning.
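For the curious, here's roughly what that looks like. This is a minimal sketch, not the repo's exact code: the model choice, probe layer, and where the success labels come from are all placeholder assumptions.

    # Minimal sketch, assuming you already have prompts plus binary
    # success labels from a prior eval run. Model/layer are illustrative.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from sklearn.linear_model import LogisticRegression

    MODEL = "gpt2"  # placeholder; any causal LM exposing hidden states works
    LAYER = -1      # which layer to probe is a hyperparameter

    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
    model.eval()

    def prompt_features(prompt):
        # Hidden state at the final prompt token, before any generation.
        inputs = tok(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs)
        return out.hidden_states[LAYER][0, -1]

    def train_probe(prompts, labels):
        # labels[i] = 1 if the model solved prompts[i] in a prior eval run.
        X = torch.stack([prompt_features(p) for p in prompts]).numpy()
        return LogisticRegression(max_iter=1000).fit(X, labels)

    def route(probe, prompt, threshold=0.5):
        # Cheap path if the probe predicts success, expensive reasoning otherwise.
        x = prompt_features(prompt).numpy().reshape(1, -1)
        p_success = probe.predict_proba(x)[0, 1]
        return "cheap" if p_success >= threshold else "expensive-reasoning"

The threshold trades cost against accuracy; in practice you'd calibrate it on a held-out set rather than hardcoding 0.5.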
Two findings that surprised us:
1. The same model has completely different internal representations of "difficulty" depending on decoding settings. What GPT-oss thinks is hard with greedy decoding ≠ what it thinks is hard with sampling (rough sketch of why after this list).
2. Model difficulty and human difficulty are orthogonal. The problems they struggle with aren't the ones we struggle with, and this gap increases with extended reasoning.
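On (1): the concrete way this shows up is that the success labels themselves change with the decoding config, so probes have to be trained per config. A toy illustration, reusing `tok`/`model` from the sketch above; the substring correctness check is crude and only for illustration:

    def success_label(prompt, answer, do_sample):
        # Generate under a given decoding config and score correctness.
        inputs = tok(prompt, return_tensors="pt")
        with torch.no_grad():
            ids = model.generate(**inputs, max_new_tokens=64, do_sample=do_sample)
        completion = tok.decode(ids[0, inputs["input_ids"].shape[1]:])
        return int(answer in completion)  # crude check, illustration only

    # The same prompt can get different labels under greedy vs. sampling:
    # greedy  = [success_label(p, a, do_sample=False) for p, a in eval_set]
    # sampled = [success_label(p, a, do_sample=True)  for p, a in eval_set]
    # so a probe trained on greedy labels need not transfer to sampled ones.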
Code: https://github.com/KabakaWilliam/llms_know_difficulty
Probes: https://huggingface.co/CoffeeGitta/pika-probes
Happy to answer questions.