I'm very curious about how they trained the lightweight classifier that decides when to switch models. Is it supervised? Did they use an LLM as a teacher? The featurization doesn't seem trivial either: how do you build a simple but still meaningful representation of the task(s)?
fanyangxyz33•59m ago