Just dropped a new research blog post about "thought engineering"
Explores 2 concepts:
1. Natural-language (NL) confidence scores in LLM responses. For example, an LLM states that it is 50% confident in its answer
2. Grid search with iterative refinement to find optimal decision thresholds over those NL confidence scores
It shows that thought engineering can improve LLM performance, and that these models have some awareness of when they are less confident, a metacognitive ability measured only in humans to date.
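The two concepts above can be sketched in a few lines. This is a hedged illustration, not the post's actual method: the `results` data (model-stated confidence paired with whether the answer was correct) is hypothetical, and the grid here is a simple fixed sweep rather than the iterative refinement described.

```python
# Hypothetical evaluation data: (model-stated NL confidence, answer was correct?)
results = [(0.9, True), (0.8, True), (0.5, False),
           (0.3, False), (0.7, True), (0.4, True)]

def accuracy_above(threshold, data):
    """Accuracy when the model only answers at or above the confidence threshold."""
    kept = [correct for conf, correct in data if conf >= threshold]
    return sum(kept) / len(kept) if kept else 0.0

# Concept 2: grid-search candidate thresholds and keep the best-performing one.
best = max((t / 10 for t in range(11)), key=lambda t: accuracy_above(t, results))
```

With this toy data the sweep selects the lowest threshold at which only correct answers remain, which is the basic idea of thresholding NL confidence; a refinement pass would then re-grid around `best` at finer resolution.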
pranavc28•10h ago
Give it a look, DM me your thoughts!