Edit: I never actually expected AGI from LLMs. That was snark. I just think it's notable that the fundamental gains in LLM performance seem to have dried up.
But why does this paper impact your thinking on it? It is about budget and recognizing that different LLMs have different cost structures. It's not really an attempt to improve LLM performance measured absolutely.
It's mostly hand-waving, hype, credulity, and unproven claims of scalability right now.
You can't move the goal posts because they don't exist.
And most would have accepted the recommendation, because the model sold it as a less common tactic while sounding very logical.
Once you've started to argue with an LLM you're already barking up the wrong tree. Maybe you're right, maybe not, but there's no point in arguing it out with an LLM.
So many people just want to believe, instead of accepting the reality that LLMs are quite unreliable.
Personally, it's usually fairly obvious to me when LLMs are bullshitting, probably because I have lots of experience detecting it in humans.
arXiv is essentially a blog in an academic format, popular among Asian and South Asian academic communities.
Currently you can launder reputation with it, just like "white papers" in the crypto world attracted capital for some time.
This ability will diminish as more people catch on.
While technically true, why would you want to use it when OpenAI itself provides several models that are many times cheaper and better?
I heard the best way is through valuations
Rather than the much more obvious: Preference-prior Informed LinUCB For Adaptive Routing (PILFAR)
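For anyone unfamiliar, LinUCB (the contextual-bandit algorithm the backronym riffs on) is easy to sketch. This is a rough illustration of LinUCB-style routing between models, not the paper's actual setup: the model names, feature dimension, alpha, and reward shape here are all made-up assumptions.

```python
import numpy as np

class LinUCBRouter:
    """Disjoint LinUCB bandit that routes queries among candidate models.
    Illustrative sketch only; names and parameters are hypothetical."""

    def __init__(self, models, dim, alpha=1.0):
        self.models = models
        self.alpha = alpha  # exploration strength
        # Per-arm ridge-regression state: A = I + sum(x x^T), b = sum(r * x)
        self.A = {m: np.eye(dim) for m in models}
        self.b = {m: np.zeros(dim) for m in models}

    def route(self, x):
        """Pick the model with the highest upper confidence bound for context x."""
        best, best_ucb = None, -np.inf
        for m in self.models:
            A_inv = np.linalg.inv(self.A[m])
            theta = A_inv @ self.b[m]  # per-arm reward estimate
            ucb = theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)
            if ucb > best_ucb:
                best, best_ucb = m, ucb
        return best

    def update(self, model, x, reward):
        """Fold in the observed reward (e.g. answer quality minus cost)."""
        self.A[model] += np.outer(x, x)
        self.b[model] += reward * x
```

The cost-awareness the paper is said to emphasize would live in the reward signal (e.g. quality minus a price term), so cheaper models win whenever the quality gap doesn't justify the spend.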
delichon•2h ago
Aka wisdom. No, LLMs don't have that. Me neither; I usually have to step into the rabbit holes in order to detect them.