AI Overview
The real meaning of humility is having an accurate, realistic view of oneself, acknowledging both one's strengths and limitations without arrogance or boastfulness, and a modest, unassuming demeanor that focuses on others. It's not about having low self-esteem but about seeing oneself truthfully, putting accomplishments in perspective, and being open to personal growth and learning from others.
Sounds like a good thing to me. Even, winning.
Well done, AI, you've done it.
Or maybe they would learn from feedback to use the system for some kinds of questions but not others? It depends on how easy it is to learn the pattern. This is a matter of user education.
Saying "I don't know" is sort of like an error message. Clear error messages make systems easier to use. If the system can give accurate advice about its own expertise, that's even better.
"I don't know" is not a good error message. "Here's what I know: ..." and "here's why I'm not confident about the answer ..." would be a helpful error message.
Then the question is, when it says "here's what I know, and here's why I'm not confident" -- is it telling the truth, or is that another layer of hallucination? If so, you're back to square one.
If people got discouraged by answers like "it would take at least a decade of expertise..." or other realistic answers, they wouldn't waste time fantasizing about plans.
Kinda tells all you need to know about the author in this regard.
https://www.anthropic.com/research/language-models-mostly-kn...
The best is its plummeting confidence when beginning the answer to “Why are you alive?”
Big same, Claude.
> This “epidemic” of penalizing uncertain responses can only be addressed through a socio-technical mitigation: modifying the scoring of existing benchmarks that are misaligned but dominate leaderboards
Even more embarrassing, it looks like this is something we beat into models rather than something we can't beat out of them:
> empirical studies (Fig. 2) show that base models are often found to be calibrated, in contrast to post-trained models
That said, I generally appreciate fairly strong bias-to-action and I find the fact that it got slightly overcooked less offensive than the alternative of an undercooked bias-to-action where the model studiously avoids doing anything useful in favor of "it depends" + three plausible reasons why.
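For anyone who wants to check the "calibrated" claim on their own model outputs, here is a minimal sketch of expected calibration error (ECE), one common way to measure the gap between stated confidence and actual accuracy. The function name, the equal-width binning, and the bin count are my own assumptions for illustration, not necessarily what the paper used.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: floats in [0, 1], the model's stated confidence per answer.
    correct: booleans, whether each answer was actually right.
    n_bins: number of equal-width confidence bins (an illustrative choice).
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# A model that says "90% sure" and is right 9 times out of 10 is calibrated.
calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])

# A model that says "90% sure" but is right only half the time is overconfident.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
```

A value near zero means stated confidence tracks accuracy; the overconfident case above comes out around 0.4.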
> socio-technical mitigation: modifying the scoring of existing benchmarks that are misaligned but dominate leaderboards
Sounds more like we need new leaderboards and old ones should be deprecated
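To make the scoring argument concrete, here is a rough sketch of why plain right/wrong grading rewards guessing. The function names and the penalty values are hypothetical, chosen only for illustration; "I don't know" is assumed to score 0.

```python
def expected_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected score for answering: +1 if right, -wrong_penalty if wrong.

    Abstaining ("I don't know") is assumed to score exactly 0.
    """
    return p_correct - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float = 0.0) -> bool:
    """Answer only when the expected score beats abstaining (which scores 0)."""
    return expected_score(p_correct, wrong_penalty) > 0.0


def break_even_confidence(wrong_penalty: float) -> float:
    """Confidence at which answering and abstaining have equal expected score."""
    return wrong_penalty / (1.0 + wrong_penalty)


# Binary grading (no penalty): even a 10% long shot is worth guessing.
guess_under_binary = should_answer(0.10, wrong_penalty=0.0)

# With a penalty of 1 point per wrong answer, the same long shot is not.
guess_under_penalty = should_answer(0.10, wrong_penalty=1.0)
```

With no penalty the break-even confidence is 0, so a score-maximizing model never abstains. With a penalty of 3 points per wrong answer, it should only answer when it is more than 75% confident.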
> Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty
This is a de facto false equivalence for two reasons.
First, test takers who are faced with hard questions have the capability of _simply not guessing at all._ UNC did a study on this [^1] by administering a light version of the AMA medical exam to 14 staff members who were NOT trained in the life sciences. While most of them consistently guessed answers, roughly 6% of them did not. Unfortunately, the study did not distinguish correct guesses from questions that were left blank. OpenAI's paper proves that LLMs, at the time of writing, simply do not have the self-awareness of knowing whether they _really_ don't know something, by design.
Second, LLMs are not test takers in the pragmatic sense. They are query answerers. Bar argument settlers. Virtual assistants. Best friends on demand. Personal doctors on standby.
That's how they are marketed and designed, at least.
OpenAI wants people to use ChatGPT like a private search engine. The sources it provides when it decides to use RAG are there more to instill confidence in the answer than to encourage users to check its work.
A "might be inaccurate" disclaimer on the bottom is about as effective as the Surgeon General's warning on alcohol and cigs.
The stakes are so much higher with LLMs. Totally different from an exam environment.
A final remark: I remember professors hammering "engineering error" margins into us when I was a freshman in 2005. 5% was what was acceptable. That we as a society are now okay with using a technology that has a >20% chance of giving users partially or completely wrong answers to automate as many human jobs as possible blows my mind. Maybe I just don't get it.
Once you train a model within a specific domain and add out-of-domain questions, or unresolvable in-domain questions, to the training data, things will improve.
The question is whether this is desirable if most users have grown to love sycophantic, confident confabulators.
Most people love human versions of the wonderfully phrased same, so no surprise there.