But if you're looking at a hosted Erlang VM for a capex of one dollar, then these folks are onto something.
Cores really are the only way to escape the breakdown of Moore's law, and this does look like a real step in the right direction. Fewer LLMs, more tiny cores.
Erlang, or at least its programming model, lends itself well to this: each process has its own local heap. If that heap can stay resident in one subsection of the CPU, the model might map well onto a reasonably priced many-core architecture.
That loosely describes plenty of multithreaded workloads, perhaps even most of them. A thread that doesn't keep its memory writes "local" to itself as much as possible will run into heavy contention with other threads, and performance will suffer a lot. It's usual to write multithreaded workloads in a way that minimizes the chance of contention, even if that doesn't involve a literal "one local heap per core".
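To make the "keep writes local, communicate explicitly" discipline concrete, here's a minimal Python sketch (not Erlang/BEAM, just an illustration of the same idea): each worker accumulates into a variable only it touches, and the only cross-thread communication is an explicit, one-shot message on a queue.

```python
import threading
import queue

def sum_chunk(chunk, results):
    local_total = 0           # local to this thread: no contention
    for x in chunk:
        local_total += x
    results.put(local_total)  # explicit, one-shot communication

def parallel_sum(data, workers=4):
    results = queue.Queue()
    size = (len(data) + workers - 1) // workers
    threads = [
        threading.Thread(target=sum_chunk,
                         args=(data[i * size:(i + 1) * size], results))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results.get() for _ in range(workers))

print(parallel_sum(list(range(1000))))  # 499500
```

The contended alternative, all threads incrementing one shared counter under a lock, computes the same answer while serializing every write.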
Manycore hasn't succeeded because, frankly, the programming model of essentially every other language is stuck in the 1950s: I, the program, am the entire and sole thing running on this computer, and must manually manage resources to match its capabilities. Hence async/await, mutable memory, race checkers, function coloring, all that nonsense. If half the effort spent straining to keep the ghost PDP-11 ruling all the programming languages had been spent on cleaning up the (several) warts in the actor model and its few implementations, we'd all be driving Waymos on Jupiter by now.
[The obvious candidates from my point of view are (1) it's an abstract mathematical model with scattered applications/implementations, most of which introduce additional constraints (in other words, there is no central theory of the actor-model implementation space), and (2) the message transport semantics are fixed: the model assumes eventual out-of-order delivery of an unbounded stream of messages. I think they should have enumerated the space of transport capabilities, including ordered/unordered and reliable/unreliable, within the core model. Treatment of bounded queuing in the core model would also be nice, but you can model that as an unreliable intermediate actor that drops messages, or one that implements a backpressure handshake, when the queue is full.]
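The "unreliable intermediate actor" trick for bounded queuing can be sketched in a few lines; this is a hypothetical Python stand-in (names like `LossyMailbox` are mine, not from any actor library), showing the drop-on-full variant rather than the backpressure handshake:

```python
import queue

class LossyMailbox:
    """Bounded mailbox modeled as an unreliable intermediary."""

    def __init__(self, capacity):
        self.inbox = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def send(self, msg):
        try:
            self.inbox.put_nowait(msg)  # accept if there is room
        except queue.Full:
            self.dropped += 1           # silently drop: unreliable link

    def receive(self):
        return self.inbox.get_nowait()

box = LossyMailbox(capacity=3)
for i in range(5):
    box.send(i)

print([box.receive() for _ in range(3)], box.dropped)  # [0, 1, 2] 2
```

A backpressure variant would instead reply to the sender with a "retry later" message when `put_nowait` fails, keeping both behaviors expressible inside the core model.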
With respect to your point (1), you might be interested in Pony, which has been discussed here from time to time, most recently: https://news.ycombinator.com/item?id=44719413 Of course there are other actor-based systems in wide use such as Akka.
Epyc has a mode where it does 4 numa nodes per socket, IIRC. It seems like that should be good if your software is NUMA aware or NUMA friendly.
But most of the desktop class hardware has all the cores sharing a single memory controller anyway, so if you had separate NUMA nodes, it wouldn't reflect reality.
Reducing cross core communication (NUMA or not) is the key to getting high performance parallelism. Erlang helps because any cross process communication is explicit, so there's no hidden communication as can sometimes happen in languages with shared memory between threads. (Yes, ets is shared, but it's also explicit communication in my book)
I tend to agree.
Where it gets -really- interesting to think about are concepts like 'core parking' actors of a given type on specific cores; e.g. 'somebusinessprocess' actor code all happens on a specific fixed set of cores and 'account' actors run on a different fixed set of cores, versus having all the cores going back and forth between both.
Could theoretically get a benefit from the instruction cache staying very consistent per core, a kind of mechanical sympathy (I think Disruptors also take advantage of this).
On the other hand, it may not be as big a benefit, in the sense that cross-process writes become cross-core writes, and those tend to lead to their own issues...
fun to think about.
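The 'core parking' idea above can be toyed with from userspace. A minimal sketch, assuming Linux (Python's `os.sched_setaffinity` is Linux-only, hence the guard), pinning the current process, standing in for a scheduler thread that serves one actor type, to a fixed set of cores:

```python
import os

def park_on_cores(cores):
    """Pin this process to `cores`; return the resulting affinity set."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # unsupported platform (e.g. macOS, Windows)
    # Only request cores we are actually allowed to run on.
    cores = set(cores) & os.sched_getaffinity(0)
    if cores:
        os.sched_setaffinity(0, cores)
    return os.sched_getaffinity(0)

print(park_on_cores({0}))
```

A real runtime would do this per scheduler thread rather than per process, but the cache-consistency argument is the same either way.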
(And that also includes hosting, egress, power, etc).
in practice you can't though
> (And that also includes hosting, egress, power, etc).
But I will certainly try to leverage my telco-connection to get to play with more of their kit if I can.
Does 1 Animat convert to metric nitpicks?
You know you're successful once you're added to: https://www.theregister.com/Design/page/reg-standards-conver...
"The origin of queueing theory dates back to 1909, when Agner Krarup Erlang (1878–1929) published his fundamental paper on congestion in telephone traffic [for a brief account, see Saaty (1957), and for details on his life and work, see Brockmeyer et al. (1948)]." -- https://www.sciencedirect.com/topics/engineering/queueing-th...
(It's amazing how little logging went on in the phone system before computerized switching. But that's another subject.)
Just being able to start that many instances is not that exciting until we know what they can do.
However, BEAM is not the only factor in this; the entire hardware platform matters as well.
This is, after all, largely about that nice, huge CPU.
I mean, when you have all 5000 started, why not let them do some work? Stress-test it with a few real-life scenarios for 48 hours and show us some numbers.
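As a toy version of that point, here's what "spawn thousands and let them do some work" looks like with Python coroutines standing in for BEAM processes (an illustration of the shape of such a test, not a benchmark of anything):

```python
import asyncio

async def worker(n):
    await asyncio.sleep(0)  # yield to the scheduler, like a reduction
    return n * n            # trivial unit of "real" work

async def main(count=5000):
    # Spawn all workers concurrently and aggregate their results.
    results = await asyncio.gather(*(worker(i) for i in range(count)))
    return sum(results)

print(asyncio.run(main()))
```

A real stress test would give each worker an actual scenario (socket traffic, state updates) and run for hours, which is exactly the number the parent comment is asking for.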
In other words, nepobaby fault tolerance