GLM 5.2 beats Claude in our benchmarks

https://semgrep.dev/blog/2026/we-have-mythos-at-home-glm-52-beats-claude-in-our-cyber-benchmarks/
519•jms703•9h ago•251 comments

Knowledge Distillation of Black-Box Large Language Models (2024)

https://arxiv.org/abs/2401.07013
62•babelfish•4h ago•13 comments

Historical memory prices 1960-2026

https://dam.stanford.edu/memory-prices.html
196•vga1•8h ago•76 comments

Better Images of AI

https://betterimagesofai.org/
22•Curiositry•3h ago•12 comments

5k menus from the New York Public Library’s Buttolph Collection (1880-1920)

https://pudding.cool/2026/06/menu-story/
335•xbryanx•12h ago•87 comments

I used Claude Code to get a second opinion on my MRI

https://antoine.fi/mri-analysis-using-claude-code-opus
361•engmarketer•10h ago•473 comments

AI boom risks global financial crash, warn central bankers

https://www.telegraph.co.uk/business/2026/06/28/ai-boom-risks-global-financial-crash-central-bank...
59•b-man•1h ago•36 comments

Deciphering Basmala

https://blog.plover.com/lang/bismillah.html
19•lordgrenville•4d ago•5 comments

TOP500 at ISC’26: We have a New Number 1 Supercomputer

https://chipsandcheese.com/p/top500-at-isc26-we-have-a-new-number
83•rbanffy•7h ago•39 comments

Tell Congress: Don't Force Age Checks Online

https://act.eff.org/action/tell-congress-don-t-force-age-checks-online
21•rmason•1h ago•1 comment

The Boeing 747 begins its final descent

https://www.theatlantic.com/magazine/2026/07/boeing-747-retirement/687304/
148•dbl000•3d ago•205 comments

Show HN: Zanagrams

https://zanagrams.com/
199•pompomsheep•11h ago•53 comments

Librepods: AirPods liberated

https://github.com/librepods-org/librepods
297•rbanffy•8h ago•94 comments

Professor denounces mass AI fraud on an exam at Brown

https://english.elpais.com/education/2026-06-28/ai-fraud-at-brown-university-academic-integrity-i...
299•geox•10h ago•403 comments

Working around dragons with the Lemote Yeeloong laptop and OpenBSD

http://oldvcr.blogspot.com/2026/06/working-around-dragons-with-lemote.html
94•zdw•10h ago•22 comments

Daisugi, the Japanese technique of growing trees out of other trees (2020)

https://www.openculture.com/2020/10/daisugi.html
112•MaysonL•10h ago•36 comments

Tokenmaxxing is dead, long live tokenmaxxing

https://12gramsofcarbon.com/p/agentics-tech-things-tokenmaxxing
117•theahura•10h ago•143 comments

Researchers have developed pixels that can emit and analyse light together

https://ethz.ch/en/news-and-events/eth-news/news/2026/06/a-new-type-of-pixel.html
48•tspng•1d ago•32 comments

The Baffling World of Masayoshi Son's Presentations (2020)

https://www.bloomberg.com/news/features/2020-06-23/golden-geese-and-unicorns-inside-the-eccentric...
20•phaser•2d ago•4 comments

Show HN: DRM-Free Books

https://frequal.com/Perspectives/DrmFreeAuthors.html
77•TeaVMFan•10h ago•34 comments

Model Training as Code

https://aleph-alpha.com/en/blog/model-training-as-code/
22•peterBlue75•3d ago•9 comments

Xonaly – Canada's Independent Search Engine

https://xonaly.com/
61•backlit4034•4h ago•40 comments

A way to exclude sensitive files issue still open for OpenAI Codex

https://github.com/openai/codex/issues/2847
180•pikseladam•14h ago•121 comments

The KIDS Act would require age checks to get online

https://www.eff.org/deeplinks/2026/06/kids-act-would-require-age-checks-get-online
335•bilsbie•15h ago•280 comments

Show HN: Bash4LLM+ – A lightweight, dependency-free Bash wrapper for LLM APIs

https://github.com/kamaludu/bash4llm/
37•kamaludu•7h ago•15 comments

Examining circuit boards from the Space Shuttle's I/O Processor

https://www.righto.com/2026/06/space-shuttle-io-processor-boards.html
86•pwg•10h ago•20 comments

Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch

https://github.com/JustVugg/nanoeuler
38•vforno•7h ago•9 comments

The curious case of the disappearing Polish S (2015)

https://aresluna.org/the-curious-case-of-the-disappearing-polish-s/
210•colinprince•14h ago•71 comments

The MUMPS 76 Primer – anniversary edition

https://github.com/rochus-keller/MUMPS/blob/main/docs/MUMPS_Primer.adoc
72•Rochus•14h ago•42 comments

British Origami: the 1955 exhibition by Akira Yoshizawa (2005)

https://www.britishorigami.org/cp-lister-list/the-1955-exhibition-by-akira-yoshizawa/
22•dang•8h ago•2 comments

Sophon PFG-1: a monolithic-3D AI ASIC with 330 GB of on-die DRAM and no HBM

https://www.phantafield.com/whitepaper
24•minkowsky•1h ago

Comments

codingpanic•1h ago
I've been wondering how long it will be before RAM is fabbed on die to get around supply issues. This is one of the first such designs I've read about. How long before Apple releases a CPU with RAM on die?
minkowsky•1h ago
Author here. The supply angle is exactly the motivation — HBM is the hardest part to get and ~26% of an AI rack's BOM.

First, separate three things people lump together. Apple already does memory on package (M-series unified memory = LPDDR5X dies next to the SoC). The near-term industry path is bonded stacking (AMD 3D V-cache, HBM4's logic base die). What we're doing is monolithic — growing the memory on top of finished logic. Three reasons that distinction matters:

1. Bonding only helps at the margin. A hybrid-bond interface still has µm-scale interconnects, and with them a relatively large interconnect capacitance, so at memory bandwidth the I/O drivers crossing it dissipate most of the power and overheat. You move the memory closer without escaping the I/O energy. Monolithic inter-tier vias are nanometer-scale (we model ~1% of the interconnect energy of a bonded interface), and that's the only thing that actually moves the needle.

2. 2D-TMDs are the only functional CMOS you can build in the BEOL. Monolithic 3D means fabricating the upper tiers after the logic, at ≤450 °C, or you cook everything underneath. Silicon needs ~1000 °C; low-temp oxide semiconductors (IGZO) are n-type only, so no real CMOS. 2D-TMDs give both n- and p-type at BEOL temperature. Nothing else does.

3. ~6 orders of magnitude lower off-current (~1 fA/µm) finally makes a capacitor-free cell work. Conventional 1T1C DRAM needs a big storage capacitor — the deep-trench / high-aspect-ratio etch you can't do in the BEOL anyway. A 2T0C gain cell holds charge on a transistor gate with no capacitor; in silicon it leaked away in microseconds, so it was never usable. With 2D-TMD leakage you get ~1.8 s retention — refresh at ~1 Hz and drop the capacitor, and the trench, entirely.
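
A rough way to sanity-check point 3 is to treat retention as usable charge divided by leakage current. The sketch below is a back-of-envelope model, not the whitepaper's method: it assumes constant leakage, a 1 µm wide access device, and a charge budget back-solved from the quoted 1.8 s retention. The names `retention_seconds`, `charge_budget_c`, and `LEAKAGE_RATIO` are illustrative.

```python
# Back-of-envelope retention estimate for a capacitor-free (2T0C) gain cell.
# Assumptions (not from the whitepaper): constant leakage, a 1 um wide device,
# and a usable charge budget back-solved from the quoted 1.8 s retention.

def retention_seconds(charge_budget_c: float, leakage_a: float) -> float:
    """Time for a constant leakage current to drain the usable stored charge."""
    return charge_budget_c / leakage_a

TMD_LEAKAGE_A = 1e-15    # ~1 fA/um off-current from the comment, times 1 um width
TMD_RETENTION_S = 1.8    # retention quoted in the comment
LEAKAGE_RATIO = 1e6      # "~6 orders of magnitude" more leakage in the silicon cell

# Charge the storage node can lose before the bit is unreadable (implied, not measured).
charge_budget_c = TMD_RETENTION_S * TMD_LEAKAGE_A

# Same cell, same charge budget, but with silicon-level leakage.
silicon_retention_s = retention_seconds(charge_budget_c, TMD_LEAKAGE_A * LEAKAGE_RATIO)

print(f"implied charge budget: {charge_budget_c * 1e15:.2f} fC")
print(f"silicon gain-cell retention: {silicon_retention_s * 1e6:.2f} us")
```

Under these assumptions the silicon cell holds its bit for roughly 1.8 µs, which lines up with the "leaked away in microseconds" remark, while the 2D-TMD cell only needs a refresh about once per second.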

Rohansi•1h ago
They're typically manufactured with very different processes so one has to wonder what compromises are being made here to get both on the same die.
wmf•1h ago
This design is absolutely wild. It probably won't work but I admire the dream.
minkowsky•1h ago
Author here. The economics are more realistic than those of Cerebras's wafer-scale ASIC.
binyu•1h ago
Hello, kudos for the tremendous work. Could you explain the difference between your design and Cerebras?

Best

minkowsky•1h ago
Author here. Thanks! Short version: Cerebras and Sophon attack the same memory wall from opposite axes: they scale out in 2D, we scale up in 3D.

Cerebras WSE-3 is a brilliant packaging play: one wafer-scale chip (~46,000 mm², ~900k cores) with ~44 GB of SRAM spread across the plane, so compute and memory sit side by side with enormous bandwidth. The catch is density — SRAM is a 6T cell, so even a whole wafer only holds ~44 GB. An 80B model doesn't fit on-wafer, so weights stream in from external MemoryX (off-wafer DRAM). It's fast, but it's a ~23 kW, multi-million-dollar system, and large models are still memory-streamed.

Sophon is a single ~750 mm² die. Instead of spreading SRAM across a wafer, we stack DRAM on top of the logic — 64 monolithic 3D tiers of 2D-TMD compute-in-memory and capacitor-less gain-cell DRAM. The gain cell is denser than SRAM per layer, and we stack 32 memory tiers of it, so we get 330 GB on one normal-size die — enough that an 80B model is fully resident, no streaming, no off-chip memory at all. ~1 kW, not 23 kW.

So the real difference is SRAM-in-2D vs DRAM-in-3D: Cerebras maxes out planar SRAM area; we trade to denser DRAM and stack it vertically, which is what buys GB-scale on-die capacity.
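
The capacity figures quoted in this comment can be turned into per-tier bit density and a simple "does the model fit" check. This is only a sketch under stated assumptions: decimal gigabytes, the full die or wafer footprint counted for each memory tier, and 1 or 2 bytes per parameter (the thread does not state the weight precision). KV cache and activations are ignored. `bits_per_mm2` and `fits` are illustrative helper names.

```python
# Density and fit arithmetic from the figures quoted in the comment.
# Assumptions: decimal GB, full footprint available to every memory tier,
# weights only (no KV cache or activations), precision chosen by the caller.

GB = 1e9  # bytes per gigabyte (decimal), an assumption

def bits_per_mm2(capacity_gb: float, area_mm2: float, tiers: int = 1) -> float:
    """Average memory bits per mm^2 of footprint, per tier."""
    return capacity_gb * GB * 8 / (area_mm2 * tiers)

def fits(params_billion: float, bytes_per_param: float, capacity_gb: float) -> bool:
    """True if the weights alone fit in the given capacity."""
    return params_billion * bytes_per_param <= capacity_gb

wse3_density = bits_per_mm2(44, 46_000)             # planar SRAM over a whole wafer
sophon_density = bits_per_mm2(330, 750, tiers=32)   # 32 stacked memory tiers on one die

print(f"WSE-3:  {wse3_density / 1e6:.2f} Mbit/mm^2")
print(f"Sophon: {sophon_density / 1e6:.2f} Mbit/mm^2 per tier")
print(f"per-tier ratio: {sophon_density / wse3_density:.1f}x")

for bytes_per_param in (1, 2):
    print(bytes_per_param, "B/param:",
          "WSE-3 fits" if fits(80, bytes_per_param, 44) else "WSE-3 streams",
          "| Sophon fits" if fits(80, bytes_per_param, 330) else "| Sophon streams")
```

With these inputs, an 80B model's weights (80 GB at 1 byte per parameter, 160 GB at 2) exceed 44 GB but stay under 330 GB. The WSE-3 density is understated as a cell-level comparison, since much of the wafer area is compute rather than SRAM.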

Honest caveat: Cerebras ships real silicon today and is genuinely fast — they proved wafer-scale integration works. We're pre-silicon, betting on a harder materials path (2D-TMD monolithic 3D). The upside, if it yields, is capacity-per-watt and per-dollar that planar SRAM can't reach.

addaon•1h ago
Since when are we doing 32-layer planar transistor logic on a single chip? Even ignoring the use of FETs for eDRAM… I didn’t realize decent logic density was possible in the BEOL.
minkowsky•7m ago
Because we can put FETs on any layer. A conventional BEOL usually doesn't need such high density; the achievable density depends on which lithography tool and mask you pick.
brcmthrowaway•1h ago
What is this? AI generated company?
vessenes•1h ago
Minkowsky, cool design! Question - the ASIC designers I've worked with over the years have been fairly adamant that integrating memory on package interspersed with logic is very difficult; the general statements run like "those designs always look great on paper, but never tape out properly".

Have you done any hardware tests of this plan? Is this still considered quality advice?

Second q, why start with 28nm? Is the idea that you want to stick with TSMC and be able to shrink? If this does in fact work well, I can imagine wanting to shoot for a smaller process node pretty quickly. Is there some sort of tech / design gap you'll need to figure out as you go?

gfody•1h ago
isn't Cerebras the proof in the pudding for this design? it seems like AI chips galore are coming out of the woodwork, but Cerebras is 10 years down this rabbit hole and poised to dominate
vessenes•14m ago
I believe Cerebras is one wafer, not deeply stacked; each core is roughly half memory, half compute by area.
minkowsky•37m ago
Due to the thermal budget, most silicon designs are constrained to a 2D layout, so memory competes with logic for area. We instead stack logic in the back end, between the metal layers.

We have fabricated 2T0C DRAM arrays with a monolithic 3D structure. That was a must-do.

Why 28nm? Because it's cheap, widely available, and already gives us enough performance to beat Nvidia Vera Rubin. We have a roadmap for scaling it down: https://www.phantafield.com/whitepaper#6-scaling-roadmap

RobLach•1h ago
MoS2 lattice construction?
matt123456789•45m ago
I suspect you are being downvoted because your answer is AI-generated, but I found it very clear and will upvote.
binyu•39m ago
What makes you think his reply was AI generated?

Edit: I can see a bunch of hints, most definitely. Still a good comment though.

minkowsky•11m ago
I do use AI for some of the answers. I now know the penalty. Thank you for the heads up.
binyu•45m ago
> they scale out in 2D, we scale up in 3D.

This actually helps a lot, thanks.

> Instead of spreading SRAM across a wafer, we stack DRAM on top of the logic

Is this done with current manufacturing technologies? Does it require a special process?

> no streaming, no off-chip memory at all. ~1 kW, not 23 kW

Is this for an individual compute unit? Compared to Cerebras, what's the ratio of power used vs compute output?

minkowsky•9m ago
I think you are asking about energy per token. Cerebras is 12.8 J, Sophon is 25.8 mJ. That is roughly 500x, or close to three orders of magnitude.
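
A quick check of that gap, using only the two energy-per-token figures quoted in the comment. The variable names are illustrative, and the figures themselves are the author's claims, not measurements reproduced here.

```python
import math

# Energy per token as quoted in the comment (author's figures, not verified here).
cerebras_j_per_token = 12.8
sophon_j_per_token = 25.8e-3   # 25.8 mJ

ratio = cerebras_j_per_token / sophon_j_per_token
orders_of_magnitude = math.log10(ratio)

print(f"ratio: {ratio:.0f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio comes out just under 500x, about 2.7 orders of magnitude, so "three orders" is a rounded-up description of the same numbers.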
JumpCrisscross•1h ago
Can you explain why?
minkowsky•1h ago
I have a detailed comparison with Cerebras in the economic analysis section: https://www.phantafield.com/whitepaper#7-economic-analysis
wmf•17m ago
I'm questioning technical risks such as BEOL transistors and 2T DRAM cell structure, not the economics. Cerebras has already retired their technical risk.