If a book or movie is ever made about the history of AI, the script for this period would probably go something like this…
(Some dramatic license here, sure. But not much more than your average "based on true events" script.)
In 1957, Frank Rosenblatt built a physical neural network machine called the Perceptron. It used variable resistors and reconfigurable wiring to simulate brain-like learning. Each resistor had a motor to adjust weights, allowing the system to "learn" from input data. Hook it up to a fridge-sized video camera (20x20 resolution), train it overnight, and it could recognize objects. Pretty wild for the time.
Rosenblatt was a showman—loud, charismatic, and convinced intelligent machines were just around the corner.
Marvin Minsky, a jealous academic peer of Frank, was in favor of a different approach to AI: Expert Systems. He published a book (Perceptrons, 1969) which all but killed research into neural nets. Marvin pointed out that no neural net with a depth of one layer could solve the "XOR" problem.
While the book's findings and mathematical proof were correct, the conclusions drawn from them rested on assumptions that didn't hold up: the proof applied only to single-layer perceptrons, and training algorithms for deeper networks, like backpropagation, were not yet in use.
As a result, a lot of academic AI funding was directed towards Expert Systems. The flagship of this was the MYCIN project. Essentially, it was a system to find the correct antibiotic based on the exact bacteria a patient was infected with. The system thus had knowledge about thousands and thousands of different diseases with their associated symptoms. At the time, many different antibiotics existed, and using the wrong one for a given disease could be fatal to the patient.
When the system was finally ready for use... after six years (!), the pharmaceutical industry had developed “broad-spectrum antibiotics,” which did not require any of the detailed analysis MYCIN was developed for.
The period of suppressing Neural Net research is now referred to as (one of) the winter(s) of AI.
--------
As said, that is the fictional treatment. In reality, the facts, motivations, and behavior of the characters are a lot more nuanced.
I went through Stanford CS when those guys were in charge. It was starting to become clear that the emperor had no clothes, but most of the CS faculty was unwilling to admit it. It was really discouraging. Peak hype was in "The fifth generation: artificial intelligence and Japan's computer challenge to the world" (1983), by Feigenbaum. (Japan at one point in the 1980s had an AI program which attempted to build hardware to run Prolog fast.)
Trying to use expert systems for medicine lent an appearance of importance to something that might work for auto repair manuals. It's mostly a mechanization of trouble-shooting charts. It's not totally useless, but you get out pretty much what you carefully put in.
The main barrier to scaling was workflow integration: a lack of electronic data and, where data did exist, a lack of interoperability (as is still true today). The other barriers were maintenance and performance monitoring, which remain issues today in healthcare and other industries.
I do agree the 5th Generation project never made sense, but as you point out they had developed hardware to accelerate Prolog and wanted to show it off and overused the tech. Hmmm, sounds familiar...
The paper of Ueda they cite is so lovely to read, full of marvelous ideas:
Ueda K. Logic/Constraint Programming and Concurrency: The hard-won lessons of the Fifth Generation Computer project. Science of Computer Programming. 2018;164:3-17. doi:10.1016/j.scico.2017.06.002 open access: https://linkinghub.elsevier.com/retrieve/pii/S01676423173012...
No statistical dependency parser came near it accuracy-wise until BERT/RoBERTa + biaffine parsing.
> Worse, it seems other researchers deliberately stayed away. John McCarthy, who coined the term “artificial intelligence”, told Piccinini that when he and fellow AI founder Marvin Minsky got started, they chose to do their own thing rather than follow McCulloch because they didn’t want to be subsumed into his orbit.
[1] https://www.newscientist.com/article/mg23831800-300-how-a-fr...
I guess it depends on what you mean by "documented". If you're talking about a historical retrospective, written after the fact by a documentarian / historian, then you're probably correct.
But in terms of primary sources, I'd say it's fairly well documented. A lot of the original documents related to the earlier days of AI are readily available[1]. And there are at least a few books from years ago that provide a sort of overview of the field at that moment in time. In aggregate, they provide at least a moderate coverage of the history of the field.
Consider also that the term "History of Artificial Intelligence" has its own Wikipedia page[2], which strikes me as reasonably comprehensive.
[1]: Here I refer to things like MIT CSAIL "AI Memo series"[3] and related[4][5], the Proceedings of the International Joint Conference on AI[6], the CMU AI Repository[7], etc.
[2]: https://en.wikipedia.org/wiki/History_of_artificial_intellig...
[3]: https://dspace.mit.edu/handle/1721.1/5460/browse?type=dateis...
[4]: https://dspace.mit.edu/handle/1721.1/39813
[5]: https://dspace.mit.edu/handle/1721.1/5461
[6]: https://www.ijcai.org/all_proceedings
[7]: https://www.cs.cmu.edu/Groups/AI/html/rep_info/intro.html
BTW, the ad hoc treatment of uncertainty in MYCIN (certainty factors) motivated the later work on Bayesian networks.
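The certainty-factor combination rule is simple enough to sketch in a few lines. This is a rough illustration of the textbook rule, not MYCIN's actual Lisp:

```python
def combine_cf(cf1, cf2):
    """Combine two certainty factors (in [-1, 1]) for the same hypothesis,
    using the textbook MYCIN combination rule."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 <= 0 and cf2 <= 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# Two weakly supportive rules reinforce each other:
print(combine_cf(0.4, 0.4))    # 0.64
# Conflicting evidence partially cancels:
print(combine_cf(0.7, -0.3))   # ~0.57
# There is no joint probability model anywhere in this; that is the
# "ad hoc" part that Bayesian networks later put on a proper footing.
```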
I would love to see a "Halt and Catch Fire" style treatment of this era.
> Marvin Minsky, a jealous academic peer of Frank, was in favor of a different approach to AI: Expert Systems. He published a book (Perceptrons, 1969) which all but killed research into neural nets. Marvin pointed out that no neural net with a depth of one layer could solve the "XOR" problem.
I think a lot of people have an impression - an impression I shared until recently - that the Perceptrons book was a "hit piece" aimed at intentionally destroying interest in the perceptron approach. But having just finished reading the Parallel Distributed Processing book and being in the middle of reading Perceptrons right now, I no longer fully buy that. The effect may well have been what's widely claimed, but Minsky and Papert don't come across as nearly as "anti-perceptron" as the received wisdom suggests.
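For anyone who hasn't seen the XOR limitation concretely, here is a minimal sketch (plain NumPy, purely illustrative, not from the book): the classic perceptron update never gets all four XOR cases right because the two classes aren't linearly separable, while a fixed two-layer network with OR and AND hidden units separates them immediately.

```python
import numpy as np

# XOR truth table
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

def step(z):
    # Heaviside step activation, as in the original perceptron
    return (z > 0).astype(int)

# Single-layer perceptron: the classic update rule, capped at 1000 epochs.
# XOR is not linearly separable, so it never reaches zero errors.
w, b = np.zeros(2), 0.0
for _ in range(1000):
    errors = 0
    for xi, ti in zip(X, y):
        pred = step(xi @ w + b)
        w += (ti - pred) * xi
        b += (ti - pred)
        errors += int(pred != ti)
    if errors == 0:
        break
print("single layer:", step(X @ w + b), "- never matches [0 1 1 0]")

# Two layers with hand-picked weights: hidden units compute OR and AND,
# and the output fires when OR is true but AND is false, i.e. XOR.
W1 = np.array([[1, 1], [1, 1]])    # each hidden unit sums both inputs
b1 = np.array([-0.5, -1.5])        # thresholds for OR (>0) and AND (>1)
W2 = np.array([1, -1])             # OR minus AND
b2 = -0.5
hidden = step(X @ W1 + b1)
print("two layers:  ", step(hidden @ W2 + b2))   # -> [0 1 1 0]
```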
I'm in no way an expert, but I feel that today's LLMs lack some concepts that are well known in logical-reasoning research. Something like: semantics.
And what's remarkable about LLMs is exactly that: they don't reason like machines. They don't use the kind of hard machine logic you see in an if-else chain. They reason using the same type of associative abstract thinking as humans do.
"[LLMs] reason using the same type of associative abstract thinking as humans do": do you have a reference for this bold statement?
I entered "associative abstract thinking llm" in a good old search engine. The results point to papers rather hinting that they're not so good at it (yet?), for example: https://articles.emp0.com/abstract-reasoning-in-llms/.
But the closest thing is probably Anthropic's famous interpretability papers:
https://transformer-circuits.pub/2024/scaling-monosemanticit...
https://transformer-circuits.pub/2025/attribution-graphs/bio...
In these, Anthropic finds circuits in an LLM that correspond to high-level abstractions the model can recognize and use, and traces the ways they can be connected, which forms the foundation of associative abstract thinking.
Code:
https://github.com/norvig/paip-lisp
Book:
https://archive.org/details/github.com-norvig-paip-lisp_-_20...
As for the Lisp implementation, SBCL works fine everywhere; if not, try ECL or CCL.
The simplest example and the one I usually bring up is log files, where the primary delimiter is \n and the secondary is likely some whitespace, which can easily be replaced with Prolog delimiters and a bit of decoration. This turns the data into Prolog code which can be consulted as is and complemented with rules abstracting complex queries.
Something similar can be done with JSON files.
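To make that concrete, here's a rough sketch (my own illustrative Python; the log format and the log_entry/3 predicate name are made up) of the kind of decoration that turns a log file into facts a Prolog system can consult directly and then query through rules:

```python
import re

# Hypothetical log format: "<date> <time> <LEVEL> <free text>".
# The "decoration" is just quoting fields and wrapping them in a
# log_entry/3 fact (a made-up predicate name), so the output file is
# valid Prolog that can be consulted as is.
LOG_LINE = re.compile(r"(\S+ \S+)\s+(\w+)\s+(.*)")

def quote(s):
    # Escape a string as a Prolog quoted atom
    return "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'"

def to_fact(line):
    m = LOG_LINE.match(line.strip())
    if not m:
        return None
    timestamp, level, message = m.groups()
    return f"log_entry({quote(timestamp)}, {level.lower()}, {quote(message)})."

for line in ["2024-05-01 12:03:17 ERROR disk quota exceeded on /var/log",
             "2024-05-01 12:03:18 INFO retrying write"]:
    print(to_fact(line))
# log_entry('2024-05-01 12:03:17', error, 'disk quota exceeded on /var/log').
# log_entry('2024-05-01 12:03:18', info, 'retrying write').
```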
bane•4mo ago
Our core system was built of thousands upon thousands of hand-crafted rules informed by careful statistical analysis of hundreds of millions of entries in a bulk data system.
Part of my job was to build the system that analyzed the bulk data and produced the stats, and the other part was carefully testing and fixing the rulesets for certain languages. It was mind-numbing work, and looking back we were freakishly close to having all the bits and pieces needed for what was then bleeding-edge ML, had we chosen to go that way.
However, we chose expert systems because it gave us tremendous insight into what was happening, and the opportunity to debug and test things at an incredibly granular scale. It was fully possible to say "the system has this behavior because of xyz" and it was fully possible to tune the system at individual character levels of finesse.
Had we wanted to dive into ML, we could have used this foundation as a bootstrap into building a massive training set. But the founders leaned towards expert systems and I think, at the time, it was the right choice.
The technology was acquired, and I wonder if the current custodians use it for those obvious next-step purposes.
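The "this behavior because of xyz" property described above is the core appeal of rule-based systems: the trace of fired rules is the explanation. Here's a toy forward-chaining sketch (my own illustration, with made-up rules; it has nothing to do with the actual system described):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]

def run(facts, rules):
    """Forward-chain until nothing new fires; the list of fired rule names
    is the explanation for why the facts ended up the way they did."""
    fired = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.name not in fired and rule.condition(facts):
                rule.action(facts)
                fired.append(rule.name)
                changed = True
    return fired

# Made-up rules, loosely in the spirit of language identification
rules = [
    Rule("ascii_only", lambda f: f["text"].isascii(),
         lambda f: f.update(script="latin")),
    Rule("latin_defaults_to_english", lambda f: f.get("script") == "latin",
         lambda f: f.update(language="en")),
]
facts = {"text": "hello world"}
fired = run(facts, rules)
print(facts)              # {'text': 'hello world', 'script': 'latin', 'language': 'en'}
print("because:", fired)  # ['ascii_only', 'latin_defaults_to_english']
```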
ozim•4mo ago
Maintaining the rules, or in your case the scripts for complex tasks, is much more work than anyone is willing to commit to. Another big problem was capturing tacit knowledge; no one was able to code that in reliably.
ML today promises that you won't have to hand-code the rules: you just push in data, the system figures out what the rules are, and it can then handle new data.
I don't have to code the rules to check if there is a cat in a picture; that definitely works. Making rules for data that isn't often found on the internet is still going to be a hassle. Rules change and the world changes; the knowledge cutoff, for example, is I think still a problem.
In the end, yes, you can build a nice system for some use case where you plug in an LLM for classification, and you will most likely make money on it. It just won't be "what was promised", i.e. AGI, and we are stuck with that promise; a lot of people won't accept less than that.
mindcrime•4mo ago
I agree with that. In fact, that mindset is what led me to this book in the first place. I was exploring an older book on OPS5[1] and saw this book mentioned, and started looking for it and found that it is freely available online. Seemed like something the HN crowd might enjoy, so here we are.
And that is making me look more and more into old techniques that were well established way back when...
I suspect there is some meat on that bone. I'm exploring this particular area as well. I think there's some opportunity for hybridization between LLMs / GenAI and some of these older approaches.
[1]: https://en.wikipedia.org/wiki/OPS5
ACCount37•4mo ago