- Microsoft built a supercomputer cluster for OpenAI to train GPT-3
- GPT-3 was regarded as a moonshot, and BERT was the hot topic in those days
- GPT-3 bots were spotted on 4chan
- Google fired an engineer who claimed its AI was "sentient"
That said, there's a fairly good "history of AI" page[3] at Wikipedia that covers a lot of the early material.
[1]: https://en.wikipedia.org/wiki/Dartmouth_workshop
[2]: https://arxiv.org/abs/2212.11279
[3]: https://en.wikipedia.org/wiki/History_of_artificial_intellig...
lol
NHLOCAL•4h ago
It’s updated monthly and covers the key developments across models, tools, and breakthroughs in a clear, no-frills format.
MIT licensed and fully open for contributions. Ideal for staying oriented in the chaos.
buster•3h ago
One nitpick from my side: it's not clear to me what the difference between the blue and red dots in the list is...
NHLOCAL•2h ago
bArray•3h ago
I think it would be better to list pivotal events, i.e. significant changes in architecture or approach, or a benchmark being surpassed. Otherwise this is just a list of models released by big tech companies, irrespective of their importance.
Also, I think I would prefer it be renamed from "Artificial Intelligence Timeline" when it only starts in 2022, and AI encapsulates a hell of a lot of important work other than LLMs. I've been around long enough to see a few AI bubbles now.
Important events in AI history are definitely missing anyway. For example, in 1956 some of the greatest minds in the field at the time set about trying to solve AI [1], only to realise it was far more complex than they imagined. It's only now, some 70 years later, that we even approach addressing some of those original aims.
[1] https://www-formal.stanford.edu/jmc/history/dartmouth/dartmo...
NHLOCAL•2h ago
There’s actually a tendency among experts in the field to underestimate the power and potential of current AI progress.
NitpickLawyer•2h ago
I don't think it's that easy to dismiss changes in recipes, data, and training regime, tbh. There have been plenty of "huh, that's strange" moments with LLMs even without major breakthroughs in the overall architecture.
The case of one model being really strong at chess while others from the same family (larger, to boot) aren't is such a "huh" moment for me. Or the new "reasoning" thing, where the new models are obviously better at some tasks than the previous generation. Training steps are another interesting one: two teams, one fine-tuning on 100k+ samples and the other fine-tuning on 1k samples over 15 epochs, ended up with similar performance. Or the (extremely readable) simple paper out of Meta FAIR ~6 months ago, which found that repeating some (5-10%) of the samples leads to much better generalisation; a rough sketch of that idea is below.
All of these are "huh"-worthy, and I don't think we thoroughly understand why they happen, or why they sometimes happen and not other times (i.e. with 20T+ token training sets on massively scaled LLMs like the purported GPT-5).
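To make the data-repetition idea concrete, here's a minimal sketch in Python. The function name, fraction, and selection strategy are placeholders for illustration, not the paper's actual recipe:

    import random

    def repeat_fraction(samples, frac=0.05, seed=0):
        # Toy illustration: duplicate a random fraction of the training
        # samples and reshuffle, so roughly frac of the data is seen
        # twice per epoch. Placeholder recipe, not the paper's method.
        rng = random.Random(seed)
        n_repeat = int(len(samples) * frac)
        repeated = rng.sample(samples, n_repeat)  # pick ~frac of the data
        mixed = list(samples) + repeated          # originals + duplicates
        rng.shuffle(mixed)                        # interleave for training
        return mixed

    # 10,000 samples grow to ~10,500 with 5% repeated
    train = repeat_fraction(list(range(10_000)), frac=0.05)

The counterintuitive part is that this small amount of repetition reportedly helps generalisation rather than just wasting compute.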
NHLOCAL•2h ago