Other interesting notes: the invention of telegraphy and improvements to the underlying electrical systems really helped me understand communications in the 1800s better. And reading/watching The Cuckoo's Egg (with the German relay-based telephones) made me appreciate modern digital transistor-based systems.
Even today, when I work on electrical projects in my garage, I am absolutely blown away by how much people could do with limited understanding and technology 100+ years ago compared to what I'm able to cobble together. I know Newton said he saw farther by standing on the shoulders of giants, but some days I feel like I'm standing on a giant, looking backwards and thinking "I am not worthy".
The history of automatic telephony in the Bell System is roughly:
- Step by step switches. 1920s. Very reliable in terms of outright failure, but about 1% misdirected or failed calls. Totally distributed. You could remove any switch, and all it would do is reduce the capacity of the system slightly. Too much hardware per line.
- Panel. 1930s. Scaled better, to large-city central offices. Less hardware per line. Beginnings of common control. Too complex mechanically. Lots of driveshafts, motors, and clutches.
- Crossbar. 1940s. #5 crossbar was a big dumb switch fabric controlled by a distributed set of microservices, all built from relays. Most elegant architecture. All reliable wire relays, no more motors and gears. If you have to design high-reliability systems, it is worth knowing how #5 crossbar worked.
- 1ESS - first US electronic switching. 1960s. Two mainframe computers (one spare) controlling a big dumb switch fabric. Worked, but clunky.
- 5ESS - good US electronic switching. Two or more minicomputers controlling a big dumb switch fabric. Very good.
The Museum of Communications in Seattle has step by step, panel, and crossbar systems all working and interconnected.
In the entire history of electromechanical switching in the Bell System, no central office was ever fully down for more than 30 minutes for any reason other than a natural disaster, and in one case a fire in the cable plant. That record has not been maintained in the computer era. It is worth understanding why.
[1] https://archive.org/details/bellsystem_HistoryOfEngineeringA...
Go on.
The big dumb switch fabric of #5 Crossbar has no processing power at all, but it has persistent state. The units that have processing power all go down to their ground state at the end of each call processing event, and have no state that persists over transactions. The various processing units (markers, junctors, senders, originating registers, etc.) are all at least duplicated, and usually there's a pool of them. Requests "seize" a unit at random from a pool, the unit does its thing, and the unit is quickly released.
Units have self-checking, and if they fail, they drop out of their pool and raise an alarm. The call capacity or connection speed of the exchange is reduced but it keeps working. Everything has short hardware stall timers which will prevent some unit failure from hanging the exchange.
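The seize/release pattern described above can be sketched roughly in Python. This is only an illustration of the idea, not how the relay logic was actually built: the `Unit`, `Pool`, and `UnitFailed` names are invented for the example, the self-check is reduced to a single `healthy` flag, and the hardware stall timers are not modeled.

```python
import random


class UnitFailed(Exception):
    """Raised when a unit's (hypothetical) self-check fails mid-request."""


class Unit:
    """Stand-in for a marker, sender, or register. Keeps no per-call state."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        # Everything about the request lives in local variables and is gone
        # when this returns, like a unit dropping back to its ground state.
        if not self.healthy:
            raise UnitFailed(self.name)
        return f"{self.name} handled {request}"


class Pool:
    """A pool of interchangeable units; failed units drop out and alarm."""

    def __init__(self, units, rng=None):
        self.idle = list(units)
        self.alarms = []
        self.rng = rng or random.Random()

    def process(self, request):
        while self.idle:
            unit = self.rng.choice(self.idle)  # "seize" a unit at random
            self.idle.remove(unit)
            try:
                result = unit.handle(request)
            except UnitFailed:
                # The failed unit stays out of the pool and an alarm is raised;
                # the request is retried on another unit.
                self.alarms.append(unit.name)
                continue
            self.idle.append(unit)  # release the unit back to the pool
            return result
        raise RuntimeError("no working units left in pool")
```

With one bad unit out of three, calls keep completing on the remaining two and the bad unit shows up in `alarms` the first time it is seized. Capacity shrinks, but nothing else changes.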
#5 Crossbar has almost no persistent memory. End offices (for connecting subscriber lines) did not log call info. Toll offices did, but that used an output-only paper tape punch. There's so little state in the switch that matching up call start and call end events was done later in a billing office where the paper tape was read.
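As a rough illustration of what that offline matching amounts to, here is a small Python sketch. The record layout (`kind`, `call_id`, `timestamp`) and the function name are assumptions made up for the example; the real paper tape format was different and is not described here.

```python
def match_call_records(tape):
    """Pair start and end events read from an append-only log.

    `tape` is an iterable of (kind, call_id, timestamp) tuples in the order
    they were punched. This layout is hypothetical, for illustration only.
    Returns (billed, unmatched): a list of (call_id, duration) pairs and a
    dict of calls that had a start but no end on this tape.
    """
    open_calls = {}
    billed = []
    for kind, call_id, timestamp in tape:
        if kind == "start":
            open_calls[call_id] = timestamp
        elif kind == "end" and call_id in open_calls:
            billed.append((call_id, timestamp - open_calls.pop(call_id)))
    return billed, open_calls


tape = [
    ("start", "A", 100),
    ("start", "B", 105),
    ("end", "A", 160),
    ("start", "C", 170),
    ("end", "B", 400),
]
billed, unmatched = match_call_records(tape)
```

The point is that the switch only ever appends events. All the state needed to pair them up lives in the billing office, not in the exchange.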
The combination of statelessness and resource pools prevented total failure. Errors and unit failures happened occasionally but could not take down the whole switch.
There's plenty of info about #5 Crossbar online, but 1950s telephony jargon is so different from 2020s server jargon that it's not obvious that #5 Crossbar is a microservices architecture.
The museum in Seattle also has a working 3ESS (likely the only one left in the world), and has recently added a DMS-10 as well.
There are many articles missing a (2025) addition, so get to work!
Pretty impressive. It makes me sad that the trend is to move away from rock-solid stuff towards more and more unreliable and unpredictable stuff (e.g. LLMs that need constant human monitoring because they mess up so much).
I will also plug the Connections Museum, who already have a neat installation in Seattle and are working on their own 5ESS recovery for display at a new site in Colorado: https://www.youtube.com/watch?v=3X3-xeuGI5o
A university bought a 5ESS in the 80s, ran it for ~35 years, did two major retrofits, and it just kept going. One physical system, understandable by humans with schematics, that degrades gracefully and can be literally moved with trucks and patience. The whole thing is engineered around physical constraints: -48V, cable management, alarm loops, test circuits, rings. You can walk it, trace it, power it.
Modern telco / "UC" is the opposite: logical sprawl over other people's hardware, opaque vendor blobs, licensing servers, soft switches that are really just big Java apps hoping the underlying cloud doesn't get "optimized" out from under them. When the vendor loses interest, the product dies no matter how many 9s it had last quarter.
The irony is that the 5ESS looks overbuilt until you realize its total lifecycle cost was probably lower than three generations of forklifted VoIP, PBX, and UC migrations, plus all the consulting. Bell Labs treated switching as a capital asset with a 30-year horizon. The industry now treats it as a revenue stream with a 3-year sales quota.
Preserving something like this isn't just nostalgia, it's preserving an existence proof: telephony at planetary scale was solved with understandable, serviceable systems that could run for decades. That design philosophy has mostly vanished from commercial practice, but it's still incredibly relevant if you care about building anything that's supposed to outlive the current funding cycle.
Aloha•2mo ago
There is a huge opportunity about 5 years from now for edge datacenters. You have these buildings which have highly reliable power and connectivity; all that's needed is servers which can live in a NEBS environment.
ipdashc•2mo ago
But even the biggest IXP is surely tiny compared to the space required for an electromechanical exchange (that would host human operators as well). Are there just floors and floors of empty space? Like you said, on very expensive downtown real estate?