ML on Apple ][+

https://mdcramer.github.io/apple-2-blog/k-means/

120•mcramer•5mo ago

Comments

rob_c•5mo ago

Since when did regression get upgraded to full blown ML?

nekudotayim•5mo ago

What is ML if not interpolation and extrapolation?

magic_hamster•5mo ago

A million things.

Diffusion, back propagation, attention, to name a few.

have-a-break•5mo ago

Back prop and attention are just extensions of interpolation.

rob_c•5mo ago

By that logic it's all "just linear maths".

Back prop requires, and is limited to, functions that are analytically differentiable in the normal way.

Attention is... Oh dear comparing linear regression to attention is comparing a diesel jet engine to a horse.

aleph_naught•5mo ago

It's all just a series of S(S(S(....S(0)))) anyways.

stonogo•5mo ago

When you find yourself solving NP-hard problems on an Apple II, chances are strong you've entered machine learning territory

DonHopkins•5mo ago

Since when did ML get upgraded to full blown AI?

andai•5mo ago

Since we gave up on AI and ML is eh close enough.

drob518•5mo ago

Upvoted purely for nostalgia.

gwbas1c•5mo ago

Any particular reason why the author chose to do this on an Apple ][?

(I mean, the pictures look cool and all.)

I.e., did the author want to experiment with older forms of BASIC, or were they trying to learn more about old computers?

mcramer•4mo ago

I wrote about my motivation at https://mdcramer.github.io/apple-2-blog/motivation/, which is obviously tongue in cheek. Tl;dr, I refurbished my Apple ][+ to try to recover a game I wrote in high school (https://mdcramer.github.io/apple-2-blog/recover/). After being unable to find the floppy with the game, I thought I'd try something just for giggles.

shagie•5mo ago

One of my early "this is neat" programs was a genetic algorithm in Pascal. You entered a bunch of digits and it "evolved" the same sequence of digits. It started out with 10 random numbers. Their fitness (lower was better) was the sum of the per-digit differences. So if the target was "123456" and the test number was "214365", it had a fitness of 6. It took the top 5, and then mutated a random digit by a random +/- 1. It printed out each row with the full population, so you could see it scrolling as it converged on the target number.

Looking back, I want to say it was probably the July 1992 issue of Scientific American that inspired me to write that ( https://www.geos.ed.ac.uk/~mscgis/12-13/s1100074/Holland.pdf ). And as that was '92, this might have been on a Mac rather than an Apple ][+... it was certainly in Pascal (my first class in C was in August '92) and I had access to both at the time (I don't think it was Turbo Pascal on a PC, as this was a summer thing and I didn't have an IBM PC at home at the time). Alas, I remember more about the specifics of the program than I do about what desk I was sitting at.
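
For readers who want to try the digit-evolving program described above, here is a minimal Python sketch (not the original Pascal). The fitness function, the population of 10, and the "keep the top 5, mutate one digit by +/- 1" rule come from the description; everything else is an assumption made for illustration: the names `fitness`, `mutate`, and `evolve`, clamping digits to 0..9 instead of wrapping, carrying the 5 survivors over unchanged, and the fixed random seed.

```python
import random


def fitness(candidate: str, target: str) -> int:
    """Sum of absolute per-digit differences; lower is better, 0 is a match."""
    return sum(abs(int(c) - int(t)) for c, t in zip(candidate, target))


def mutate(candidate: str, rng: random.Random) -> str:
    """Nudge one randomly chosen digit by +1 or -1.

    Assumption: digits are clamped to 0..9 rather than wrapping around.
    """
    i = rng.randrange(len(candidate))
    digit = int(candidate[i]) + rng.choice((-1, 1))
    digit = max(0, min(9, digit))
    return candidate[:i] + str(digit) + candidate[i + 1:]


def evolve(target: str, pop_size: int = 10, keep: int = 5,
           seed: int = 0, max_generations: int = 10_000):
    """Return (best candidate, generations used)."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice("0123456789") for _ in target)
        for _ in range(pop_size)
    ]
    for generation in range(max_generations):
        population.sort(key=lambda c: fitness(c, target))
        if population[0] == target:
            return population[0], generation
        survivors = population[:keep]
        # Assumption: survivors are kept as-is and the remaining slots are
        # refilled with mutated copies of randomly chosen survivors.
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - keep)]
        population = survivors + children
    population.sort(key=lambda c: fitness(c, target))
    return population[0], max_generations


if __name__ == "__main__":
    print(fitness("214365", "123456"))  # the worked example from the comment
    print(evolve("123456"))
```

Because the survivors are never discarded, the best fitness can only stay the same or improve from one generation to the next, which is what produces the "scrolling toward the target" effect when each generation is printed.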

Steeeve•5mo ago

I wrote a whole project in Pascal around that time, analyzing two datasets. It was running out of memory the night before it was due, so I decided to have it run twice, once for each dataset.

That's when I learned a very important principle: "When something needs doing quickly, don't force artificial constraints on yourself."

I could have spent three days figuring out how to deal with the memory constraints. But instead I just cut the data in half and gave it two runs. The quick solution was the one that was needed. Kind of an important memory for me that I have thought about quite a bit in the last 30+ years.

aardvark179•5mo ago

I thought this was going to be about the programming language, and I was wondering how they managed to implement it on a machine that small.

Scramblejams•5mo ago

Same. What flavor of ML would be the most appropriate for that challenge, do you think?

taolson•5mo ago

While not exactly ML, David Turner's Miranda system is pretty small, and might be feasible:

https://codeberg.org/DATurner/miranda

noelwelsh•5mo ago

That's also what I was thinking. ML predates the Apple II by 4 years, so I think there is definitely a chance of getting it running! If targeting the Apple IIGS I think it would be very achievable; you could fit megabytes of RAM in those.

dekhn•5mo ago

Likely any early implementation of ML would have been on a mainframe or minicomputer, not a 6502. A mainframe/minicomputer would have had oodles of storage (both durable and RAM), as well as a compiler for a high level language (which fits what I can see in https://smlfamily.github.io/history/ML2015-talk.pdf and other locations).

noelwelsh•5mo ago

So I've been mildly nerd sniped. It looks like the first target was a PDP-10 [1]. It ran Stanford Lisp used by the "DEC 10" implementation of ML. The architecture is pretty unusual by modern standards, but it doesn't look to be that powerful and seems to top out at around 1MB of RAM. Next up we have a VAX [2] implementation. It's not clear which specific system it was originally developed for, but we're talking early 80s so it probably wasn't much more powerful than the PDP-10. Either way, I think a maxed Apple IIGS with a hefty 8MB of RAM and perhaps overclocked to 14MHz is more than enough raw power to handle ML. Unfortunately I haven't been sufficiently nerd sniped to actually implement this. I leave that as an exercise for the reader ;-)

[1]: https://en.wikipedia.org/wiki/PDP-10

[2]: https://en.wikipedia.org/wiki/VAX

dekhn•5mo ago

An enormous amount of software was developed on the PDP-10 and PDP-11 and later VAX systems that could not have been done on microcomputers in the day. You can't just compare raw RAM and clock rates; the PDPs were set up for multi-user productivity on complex problems and had a wide range of system software to enable building and deploying advanced software.

foobarian•5mo ago

That's funny, pretty sure we used Standard ML on the old oscilloscope Macs in undergrad. Not Apple 2 of course, but still already pretty dated even at that time (late 90s).

amilios•5mo ago

Bit of a weird choice to draw a decision boundary for a clustering algorithm...

mcramer•4mo ago

How so? Drawing a decision boundary is a pretty common visualization technique for understanding how an algorithm partitions a data space.
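
To make the idea concrete: once k-means has settled on its centroids, every point in the plane belongs to whichever centroid is nearest, so with two clusters the boundary is the perpendicular bisector of the segment joining them. The short Python sketch below (not the blog's Applesoft BASIC) labels each cell of a small grid with its nearest centroid and prints the result as text. The function names `nearest` and `render`, the grid size, and the example centroids are all made up for illustration.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(point, centroids[k])),
    )


def render(centroids, width=20, height=10):
    """Text map of the partition: each cell shows its nearest centroid's index."""
    rows = []
    for y in range(height - 1, -1, -1):  # print the top row first
        rows.append("".join(str(nearest((x, y), centroids))
                            for x in range(width)))
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical centroids; the boundary is where the 0s meet the 1s.
    print(render([(4, 5), (15, 5)]))
```

The edge between the region of 0s and the region of 1s is the decision boundary; with more than two centroids the same loop traces out a Voronoi diagram.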

aperrien•5mo ago

An Aeon ago in 1984, I wrote a perceptron on the Apple II. It was amazingly slow (20 minutes to complete a recognition pass), but what most impressed me at the time was that it did work. Since that time as a kid I always wondered just how far linear optimization techniques could take us. If I could just tell myself then what I know now...

alexshendi•5mo ago

This motivates me to try this on my Minstrel 4th (a 21st-century Jupiter Ace clone).

windsignaling•5mo ago

I'm surprised no one else has commented that a few of the conceptual comments in this article are a bit odd or just wrong.

> The final accuracy is 90% because 1 of the 10 observations is on the incorrect side of the decision boundary.

Who is using K-means for classification? If you have labels, then a supervised algorithm seems like a more appropriate choice.

> K-means clustering is a recursive algorithm

It is?

> If we know that the distributions are Gaussian, which is very frequently the case in machine learning

It is?

> we can employ a more powerful algorithm: Expectation Maximization (EM)

K-means is already an instance of the EM algorithm.

mcramer•4mo ago

> Who is using K-means for classification? If you have labels, then a supervised algorithm seems like a more appropriate choice.

The generated data is labeled but we can imagine those labels don't exist when running k-means. There are many applications for unsupervised clustering. I don't, however, think that there are many applications for running much of anything on an Apple ][+.

> K-means clustering is a recursive algorithm

My bad. It's iterative. I'll fix that. Thanks.

> If we know that the distributions are Gaussian, which is very frequently the case in machine learning

Gaussian distributions are very frequent and important in machine learning because of the Central Limit Theorem but, beyond that, you are correct. While many natural phenomena are approximately normal, the reason for the Gaussian's frequent use is often mathematical convenience. I'll correct my post.

> we can employ a more powerful algorithm: Expectation Maximization (EM)

Excellent point. I will fix that, too. "While k-means is simple, it does not take advantage of our knowledge of the Gaussian nature of the data. If we know that the distributions are at least approximately Gaussian, which is frequently the case, we can employ a more powerful application of the Expectation Maximization (EM) framework (k-means is a specific implementation of centroid-based clustering that uses an iterative approach similar to EM with 'hard' clustering) that takes advantage of this." Thank you for pointing out all of this!
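
As an illustration of the "iterative, EM-like with hard clustering" description, here is a minimal Python sketch of k-means (not the Applesoft BASIC code from the blog post). It alternates between an assignment step, in which each point is given to its nearest centroid, and an update step, in which each centroid moves to the mean of its points. The function names, the choice to leave an empty cluster's centroid where it is, and the toy data are assumptions for illustration.

```python
import math


def assign(points, centroids):
    """Assignment step: hard-assign each point to its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(coord) / len(members) for coord in zip(*members))
        )
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Iterate (not recurse) until the assignments stop changing."""
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


if __name__ == "__main__":
    # Made-up toy data: two well-separated blobs.
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(pts, [(0, 0), (10, 10)]))
```

The assignment step is the "hard" counterpart of EM's expectation step: a Gaussian mixture model would instead give every point a fractional responsibility for every cluster and would also estimate each cluster's covariance.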

JSR_FDED•5mo ago

Applesoft BASIC is just so darn readable. Youngsters have nothing comparable these days to learn the basics of expressing an algorithm without having to know a lot more.

And if it ever became too slow, you could reimplement the slow part in 6502 assembler, which has its own elegance. Great way to learn, glad I came up that way.

nikolay•5mo ago

You don't even need a computer for ML [0]!

[0]: https://proceedings.mlr.press/v170/marx22a/marx22a.pdf
