So what is the software development task that this thing excels at, other than bullshitting one's manager?
It is not supposed to find an answer that matches my persistence; it's supposed to tell the truth or admit that it does not know. And even if there is an "alabamer" in the training set, that is either something else (not a US state) or a misspelling; in neither case should it end up on the list.
You just like the title?
But that doesn't mean that it is not extremely useful. It only means I shouldn't ask it to spell stuff.
If a human is unable to count the n's in 'banana', we expect them to be barely functional. Articles like this one try to draw the same inference about the LLM: it can't count n's, so it must not be able to do anything else either.
But it's a bad argument, and I'm tired of hearing it.
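(For what it's worth, the counting task itself is trivial for any program, which is why an LLM hooked up to a code interpreter handles it easily. A one-line sketch:)

```python
# Counting a letter in a string is a mechanical operation, not a reasoning task.
text = "banana"
print(text.count("n"))  # prints 2
```

The failure mode in the article is about the model's tokenized view of text, not about whether the computation is hard.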
Your overall conclusion, though, seems a little divorced from context. Average people (i.e., my mom googling something) absolutely do not have the wherewithal to keep track of the pros and cons of the underlying system that generates the magical giant blue box at the top of their search results that has all the answers. They are being deliberately duped by the salesmen-in-chief of these giant companies, as are all of their investors.
LLMs are also bad at many things that humans don't notice immediately.
That is a problem because it leads humans to trust LLMs with tasks at which LLMs are currently bad, such as picking stocks, screening job applicants, providing life advice...
[0] e.g., by promoting AIs as having capacities equivalent to those of humans with various education levels, because they could pass tests that were part of the standards for, and that (for humans) correlate with other abilities of, people with that educational background.