Recycling SRAM when it becomes more precious seems like a better strategy than having that precious SRAM sit idle on otherwise sleeping cores.
This doesn't really detract from the overall point - stacking a huge per-core L2 cache and using cross-chip reads to emulate L3 with clever saturation metrics and management is very different from what any x86 CPU I'm aware of has ever done, and I wouldn't be surprised if it works extremely well in practice. It's just that it would have made a stronger article, IMO, to compare dedicated L2 + shared L2 (IBM) against dedicated L2 + shared L3 (Intel), rather than against dedicated L2 + sharded L3 (AMD).
A key distinction, however, is latency. I don't know about Granite Rapids, but sources show that Sapphire Rapids had an L3 latency around 33 ns: https://www.tomshardware.com/news/5th-gen-emerald-rapids-cpu.... According to the article, the L2 latency in the Telum II chips is just 3.8 ns (about 21 clock cycles at 5.5 GHz). Sapphire Rapids has an L2 latency of about 16 clock cycles.
IBM's cache architecture enables a different trade-off in terms of balancing the L2 versus L3. In Intel's architecture, the shared L3 is inclusive, so it has to be at least as big as L2 (and preferably, a lot bigger). That weighs in favor of making L2 smaller, so most of your on-chip cache is actually L3. But L3 is always going to have higher latency. IBM's design improves single-thread performance by allowing most of the on-chip cache to be lower-latency L2.
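To make that latency trade-off concrete, here is a rough back-of-the-envelope average-memory-access-time sketch in Python. Only the 3.8 ns L2 and ~33 ns L3 figures come from the numbers quoted above; the hit rates, the assumed "virtual L3" latency, and the DRAM penalty are invented purely for illustration.

```python
# Rough AMAT (average memory access time) sketch. In this simplified model,
# latency is charged at the level where the access hits; anything that misses
# every level pays the miss penalty. Hit rates and the DRAM penalty are
# made-up assumptions, not measured numbers.

def amat(levels, miss_penalty_ns):
    """levels: list of (hit_rate, latency_ns), fastest level first."""
    total, reach = 0.0, 1.0          # reach = probability the access gets this far
    for hit_rate, latency_ns in levels:
        total += reach * hit_rate * latency_ns
        reach *= (1.0 - hit_rate)
    return total + reach * miss_penalty_ns

# Telum II-like: big low-latency L2 (3.8 ns), with misses served from another
# core's L2 acting as a "virtual L3" (latency assumed ~12 ns here).
telum_like = amat([(0.92, 3.8), (0.50, 12.0)], miss_penalty_ns=100.0)

# Sapphire Rapids-like: smaller L2 (~16 cycles, assumed ~4 ns), backed by a
# large shared L3 at roughly 33 ns.
intel_like = amat([(0.85, 4.0), (0.70, 33.0)], miss_penalty_ns=100.0)

print(f"Telum-like AMAT:    {telum_like:.1f} ns")
print(f"Sapphire-like AMAT: {intel_like:.1f} ns")
```

The point of the sketch is only that keeping most of the on-chip capacity at L2 latency pulls the average down, under whatever hit rates you plug in.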
I would enjoy an ELI5 for the market differences between commodity chips and these mainframe grade CPUs. Stuff like design, process, and supply chain, anything of interest to a general (nerd) audience.
IBM sells 100s of Z mainframes per year, right? Each can have a bunch of CPUs, right? So Samsung is producing 1,000s of Telums per year? That seems incredible.
Given such low volumes, that's a lot more verification and validation, right?
Foundries have to keep running to be viable, right? So does Samsung bang out all the Telums for a year in one burst, then switch to something else? Or do they keep producing a steady trickle?
Not that this info would change my daily work or life in any way. I'm just curious.
TIA.
In general, scheduling of machine time and transportation of FOUPs (transport boxes holding a number of wafers, the basic unit of processing) is a big topic in the industry: optimizing machine usage while keeping the number of unfinished wafers "in flight" and the overall cycle time low. It takes weeks for a wafer to flow through the fab.
I got so jealous of some colleagues that I once even considered getting into mainframe work. CPUs at 5.5 GHz continuously (not peak...), massive caches, really, really non-stop...
Look at this tech porn: "IBM z17 Technical Introduction" - https://www.redbooks.ibm.com/redbooks/pdfs/sg248580.pdf
Cycle times are long enough so you can thoroughly refine your design.
Marketing pressures are probably extremely well thought out, as anyone working on Mainframe marketing is probably either an ex-engineer or almost an engineer by osmosis.
And the product is different enough from anything else that you can try novel ideas, but not so different that your design skills are useless elsewhere or that you can't leverage others' advances.
They have a famous networking (optical and wireline) group doing a lot of state-of-the-art research, and they deploy these in their mainframe products.
There's no other company like it. They are, in a sense, the exact opposite of Apple, where all HW engineers are pushed toward impossible deadlines and solutions that save the day. Most in-house developed IP then turns out not competitive enough and in the end doesn't make it into production (like their generations of 5G modems and the IP inside them, such as data converters).
I don’t think they neglect the migration of workloads to cloud platforms. Mainframes can only cost so much before it’s cheaper to migrate the workloads to other platforms and rewrite some SLAs. They did a ton of work on this generation to keep the power envelope in the same ballpark as the previous generation, because that’s what their clients were concerned about.
rbanffy•5h ago
Running Linux, from a user's perspective, it feels just like a normal server with a fast CPU and extremely fast IO.
jiggawatts•4h ago
I wonder how big a competitive edge that will remain in an era where ordinary cloud VMs can do 10 GB/s to zone-redundant remote storage.
inkyoto•4h ago
Whilst cloud platforms are the new mainframe (so to speak), and they have all made great strides in improving the SLA guarantees, storage is still accessed over the network (plus extra moving parts – coordination, consistency etc). They will get there, though.
RetroTechie•3h ago
Speed is not the only reason why some org/business would have Big Iron in their closet.
exabrial•4h ago
As one greybeard put it to me: Java is loosely typed and dynamic compared to COBOL/DB2/PL-SQL. He was particularly annoyed that the smallest numerical type in Java, a ‘byte’, was, quote, “a waste of bits”, and that Java was full of “useless bounds checking”, both of which were causing “performance regressions”.
The way mainframe programs are written is: the entire thing is statically typed.
uticus•2h ago
sounds like "don't try to out-optimize the compiler."
dragontamer•1h ago
In the 90s, a cryptography paper was published that brute-forced DES (the standard encryption algorithm back then) more quickly by using SIMD across 64-bit registers on a DEC Alpha.
There is also the 80s Connection Machine which was a 1-bit SIMD x 4096-lane supercomputer.
---------------
It sounds like this guy read a few 80s or 90s papers and then got stuck in that unusual style of programming. But there were famous programs back then that worked off of 1-bit SIMD x 64 lanes or x4096 lanes.
By the 00s, computers had already moved on to new patterns (and this obscure methodology was never mainstream). Still, I can imagine that if a student read a specific set of papers in those decades... this kind of mindset could get stuck.
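For anyone who hasn't seen the style, here is a minimal sketch of what "1-bit SIMD across a 64-bit register" (bitslicing) looks like. The full adder is just an illustrative stand-in, not code from the DES paper; the point is that one pass of bitwise ops evaluates the same logic across 64 independent lanes at once.

```python
# Bitslicing sketch: each 64-bit integer holds one bit from 64 independent
# problem instances, so a single bitwise operation acts on all 64 lanes.

MASK = (1 << 64) - 1  # confine Python's arbitrary-precision ints to 64 lanes

def bitsliced_full_add(a, b, carry_in):
    """One-bit full adder evaluated across 64 lanes in parallel."""
    s = (a ^ b ^ carry_in) & MASK
    carry_out = ((a & b) | (carry_in & (a ^ b))) & MASK
    return s, carry_out

# Lane i of each argument is an independent input; three bitwise ops per
# output replace 64 separate one-bit additions.
s, c = bitsliced_full_add(0xFFFF_0000_FFFF_0000, 0x0F0F_0F0F_0F0F_0F0F, 0)
```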
rbanffy•35m ago
Not always, but they do have excellent performance analysis and will do what they can to avoid slow code in the hottest paths.
pjmlp•4h ago
Other than that, there are Java, C, and C++ implementations for mainframes; for a while IBM even had a JVM implementation for IBM i (AS/400) that would translate JVM bytecodes into IBM i ones.
Additionally, all of them have POSIX environments (think WSL-like, but for mainframes), covering anything that goes into AIX, or a few selected enterprise distros like Red Hat and SUSE.
BugheadTorpeda6•2h ago
Additionally, the Unix on z/OS is not like WSL. There is no virtualization. The POSIX APIs are implemented as privileged system services with Program Calls (kind of like supervisor calls/system calls). It's more akin to a "flavor", kind of like old-school Windows and OS/2, than to modern WSL. You can interface with the system in the old-school MVS flavor and use those APIs, or use the POSIX APIs, and they are meant to work together (for instance, the TCP/IP stack and web servers on the platform are implemented with the POSIX APIs, for obvious compatibility and porting reasons).
Of course, you can run Linux on mainframes and that is big too, but usually when people refer to mainframe Unix they are talking about how z/OS is technically a Unix, which I don't think would count in the same way if it were just running a Unix environment under a virtualization layer. Windows can do that and it's not a Unix.
BugheadTorpeda6•2h ago
For applications, COBOL is king, closely followed by Java for stuff that needs web interfaces. For middleware and systems and utilities etc, assembly, C, C++, REXX, Shell, and probably there is still some PL/X going on too but I'm not sure. You'd have to ask somebody working on the products (like Db2) that famously used PL/X. I'm pretty sure a public compiler was never released for PL/X so only IBM and possibly Broadcom have access to use it.
COBOL is best thought of as a domain-specific language. It's great at what it does, but the use cases are limited; you would be crazy to write an OS in it.
fneddy•1h ago
fneddy•47m ago