What is causing this limitation? If a basic online word count tool can do this, why can't these big companies do this?
LLMs work on tokens rather than words, and even with tokens they don't know how to count them at the completion layer.
They have to be trained with something like RLHF to handle word counting at the question-answering / instruction-following layer,
or handle it at the application layer (so-called "agentic workflows"), e.g. by writing Python code to count words, or by calling a function or a CLI tool like "wc" (sketched below).
Without such tools they'd need to do complex recall over the language structure they were trained on to be able to count accurately.
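Not part of the original comment, just a minimal sketch of what that application-layer counting could look like in Python; how the tool gets wired into an LLM's function-calling is left out:

```python
# Minimal sketch of the "application layer" approach: the model never counts,
# a plain word-count tool does, and its result is fed back to the model as context.
import subprocess

def count_words(text: str) -> int:
    # Same thing a basic online word counter does: split on whitespace.
    return len(text.split())

def count_words_wc(text: str) -> int:
    # Or shell out to the classic CLI tool instead.
    out = subprocess.run(["wc", "-w"], input=text, capture_output=True, text=True)
    return int(out.stdout.strip())

draft = "Large language models generate text token by token."
print(count_words(draft))     # 8
print(count_words_wc(draft))  # 8
```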
My mental picture of LLMs is this: I like to imagine that what they do is close to us trying to learn a language from a dictionary of an alien language. We couldn't ground anything in reality, we might not even know where words start or end in the definitions, but we could pattern-match enough to be useful to an alien sending us text queries.
I also asked GPT for a metaphor, and it came back with these:
- It’s like trying to clap to music and being asked, “Make it 100 words worth of claps.” You’re working with rhythm, not actual word units, so your sense of count is fuzzy.
- LLMs are excellent at flowing language but bad at rigid constraints — like a jazz musician who can improvise beautifully but can’t stop exactly on the 137th note without counting.
viraptor•5mo ago
They're not trained for that. And there's no good reason to improve it if you can instead rerun the paragraph with "make this slightly shorter".
> If a basic online word count tool can do this
It's an entirely different technology and not comparable at all. If you want to involve an actual word counter, that's not hard to integrate: a basic loop measures the output and feeds the result back so the LLM can shorten or lengthen the text automatically before returning it to you. Something like the sketch below.
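This isn't viraptor's code, just a rough sketch of that kind of measure-and-retry loop; `generate` is a placeholder for whatever LLM call you actually use:

```python
# Rough sketch of the loop described above: count outside the model,
# then ask the model to adjust until the length is close enough.
def generate(prompt: str) -> str:
    # Placeholder: swap in a real LLM client call here.
    raise NotImplementedError("plug in your LLM API")

def write_to_length(prompt: str, target_words: int,
                    tolerance: int = 5, max_tries: int = 5) -> str:
    text = generate(prompt)
    for _ in range(max_tries):
        n = len(text.split())  # the actual counting happens here, not in the model
        if abs(n - target_words) <= tolerance:
            break
        direction = "shorter" if n > target_words else "longer"
        text = generate(
            f"Rewrite the following to be about {target_words} words "
            f"(it is currently {n} words, so make it {direction}):\n\n{text}"
        )
    return text
```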