As in, this entire cloud buildout is unnecessary because it becomes like using a calculator.
Reach out to chat.
The probes I used seem to help identify good configurations, but are quite noisy. I initially used a small probe set to make the scan tractable, then retested the higher-ranked models on a set ~10x larger.
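For anyone wanting to try the same kind of two-stage scan, here is a minimal sketch, not my actual harness. The names `two_stage_scan`, `mean_score`, and `noisy_score`, and the toy "true quality" numbers, are all made up for illustration; the only idea taken from the above is ranking everything on a small probe set, then retesting the top few on a set about 10x larger.

```python
import random

def mean_score(config, probes, score_fn):
    """Average a per-probe score over a probe set."""
    return sum(score_fn(config, p) for p in probes) / len(probes)

def two_stage_scan(configs, small_probes, large_probes, score_fn, top_k):
    """Rank all configs on the small set, then re-rank the top_k on the large set."""
    # Stage 1: cheap but noisy ranking over every configuration.
    coarse = sorted(configs,
                    key=lambda c: mean_score(c, small_probes, score_fn),
                    reverse=True)
    shortlist = coarse[:top_k]
    # Stage 2: more expensive, less noisy ranking over the shortlist only.
    return sorted(shortlist,
                  key=lambda c: mean_score(c, large_probes, score_fn),
                  reverse=True)

# Toy setup (hypothetical): 20 configurations with a hidden quality,
# observed only through a noisy per-probe score.
rng = random.Random(0)
true_quality = {c: c / 20 for c in range(20)}

def noisy_score(config, probe):
    # The probe is ignored in this toy; each evaluation just adds Gaussian noise.
    return true_quality[config] + rng.gauss(0, 0.5)

small = list(range(10))    # small probe set for the full scan
large = list(range(100))   # ~10x larger set for the retest
ranked = two_stage_scan(list(true_quality), small, large, noisy_score, top_k=5)
print(ranked)
```

Averaging over 10x more probes cuts the standard error of each mean by roughly sqrt(10), so the second pass is where the ranking becomes trustworthy. A good configuration can still get dropped in the first pass if the noise is large relative to the gaps between configurations.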
This work is obviously more complex than that, but it suggests something similar is going on, with the early layers transforming to some sort of generalized basis functions that define a universal language representation.
It turns out this does not help (somewhat surprisingly).
He learnt Icelandic in a week and had a fluent conversation on their national TV to prove it. (This is nuts; that language is extremely difficult to pick up, with nasal sounds etc.)
Of course, I guess it's not even close to average for a human to have such abilities, but I wonder if at some point LLMs and AI algorithms and models might shed light on these kinds of abstractions (like some mentioned in the comments about image recognition algos), which might help humans actually learn these things themselves, train on them, and perhaps even be taught them as a skill.
JPLeRouzic•1h ago
dnhkng•1h ago
I am working with TurboDerp to integrate this into the Exllama v3 format.