What is the size of this pool, i.e. how many GPUs would it take for an individual user to run their own equivalent today? Let's assume the LLM is fully downloadable.
I ask because, if LLMs stop improving exponentially, surely soon enough we will ALL be able to run un-quantised local LLMs of sufficient quality for day-to-day tasks.
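Purely for scale, here is the rough memory-only arithmetic I have in mind. The parameter counts, fp16 bytes-per-parameter, overhead factor, and the 24 GB consumer card are all assumptions on my part, not a claim about any particular model, and this ignores interconnect bandwidth and compute throughput entirely:

```python
import math

# Back-of-envelope: how many GPUs just to *hold* an unquantised model?
# fp16 weights take ~2 bytes per parameter; add headroom for KV cache
# and activations. All numbers below are illustrative assumptions.

def gpus_needed(params_billion: float,
                bytes_per_param: float = 2.0,   # fp16 weights
                overhead: float = 1.2,          # KV cache / activation headroom
                gpu_mem_gb: float = 24.0) -> int:
    """Minimum GPU count to fit the model in memory (memory only)."""
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes ~ GB
    total_gb = weights_gb * overhead
    return math.ceil(total_gb / gpu_mem_gb)

print(gpus_needed(70))    # ~7 consumer 24 GB cards for a 70B-parameter model
print(gpus_needed(405))   # ~41 cards for a 405B-parameter model
```

Even on that crude basis, a mid-size open model is already within reach of a determined individual, which is why I'm curious how far the frontier-scale pool is from that.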