To give an example of my idea, since some people asked why there is a "reasoning model" when I said non-reasoning: the "reasoning model" is just an instant model trained to output text that looks like reasoning.
user: "how many Rs are in strawberry" (A)
↓
reasoning model(A): "The user is asking to count the Rs in 'strawberry'. s t r a w b e r r y, I see 3 Rs. Let me double check stRawbeRRy. Yes, 3 Rs." (B)
↓
summarization model(B): "The answer is 3 Rs in 'strawberry'" (C)
↓
answer model(A,C): "There are 3 Rs in 'strawberry'."
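In code, the chain above could look roughly like the sketch below. This only shows how A, B and C are wired together. The three `stub_*` functions and `run_pipeline` are hypothetical stand-ins I made up for illustration (plain Python functions, not real models or any real API), and the word is hardcoded to keep it short.

```python
import re
from typing import Callable


def run_pipeline(
    prompt: str,
    reasoning_model: Callable[[str], str],
    summarization_model: Callable[[str], str],
    answer_model: Callable[[str, str], str],
) -> str:
    """Chain three single-pass ("instant") models: A -> B -> C -> answer."""
    reasoning = reasoning_model(prompt)        # (A) -> (B): text that looks like reasoning
    summary = summarization_model(reasoning)   # (B) -> (C): condensed conclusion
    return answer_model(prompt, summary)       # (A, C) -> final reply


# Hypothetical stand-ins: each one is just a function from text to text.
def stub_reasoning(prompt: str) -> str:
    word = "strawberry"  # assumption: hardcoded instead of parsed from the prompt
    count = word.count("r")
    return (
        f"The user is asking to count the Rs in '{word}'. "
        f"{' '.join(word)}, I see {count} Rs. "
        f"Let me double check. Yes, {count} Rs."
    )


def stub_summarize(reasoning: str) -> str:
    final_count = re.findall(r"\d+", reasoning)[-1]  # last number mentioned
    return f"The answer is {final_count} Rs in 'strawberry'"


def stub_answer(prompt: str, summary: str) -> str:
    final_count = re.findall(r"\d+", summary)[-1]
    return f"There are {final_count} Rs in 'strawberry'."


if __name__ == "__main__":
    print(run_pipeline(
        "how many Rs are in strawberry",
        stub_reasoning,
        stub_summarize,
        stub_answer,
    ))
```

Note that the answer step never sees (B), only the original prompt and the summary, which is the point of the example.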
XCSme•56m ago