On January 12, 2026, DeepSeek published "Engram" (arXiv:2601.07372), describing the same core idea: route queries to cached weight banks based on subject matter.
The concepts are similar because I built it first. My YouTube video from December 17, 2025 (https://youtu.be/T_o39s7r0iE) shows a terminal reading "RAM Coffers: ON | L2/L3 Resident: ON", 26 days before their paper.
Core shared concept: a query comes in → classify its subject → route to the relevant weight bank → a hot cache keeps it fast
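To make the shared pipeline concrete, here is a minimal sketch of that flow. Everything in it is illustrative: the keyword classifier, the `BankCache` class, and the cache capacity are my own stand-ins, not code from RAM Coffers or Engram.

```python
# Sketch of the shared routing concept: classify a query's subject,
# route it to that subject's weight bank, keep recent banks hot (LRU).
# All names and the toy classifier are hypothetical illustrations.
from collections import OrderedDict

SUBJECT_KEYWORDS = {
    "math": {"integral", "prime", "matrix"},
    "code": {"python", "compile", "segfault"},
    "history": {"empire", "treaty", "dynasty"},
}

def classify(query: str) -> str:
    """Pick the subject whose keyword set best overlaps the query."""
    words = set(query.lower().split())
    scores = {s: len(words & kw) for s, kw in SUBJECT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

class BankCache:
    """Hot cache of weight banks, evicting the least-recently-used bank."""
    def __init__(self, capacity: int = 2):
        self.capacity = capacity
        self.banks = OrderedDict()

    def get(self, subject: str) -> str:
        if subject in self.banks:
            self.banks.move_to_end(subject)       # cache hit: mark hot
        else:
            if len(self.banks) >= self.capacity:  # evict coldest bank
                self.banks.popitem(last=False)
            self.banks[subject] = f"weights<{subject}>"  # stand-in for a load
        return self.banks[subject]

cache = BankCache()
subject = classify("how do I fix a segfault in python?")
bank = cache.get(subject)
```

In a real system the string placeholder would be a memory-mapped weight bank and the classifier a learned router; the cache discipline is the part both designs share.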
What I added beyond the core:
• NUMA topology – weights pinned to specific memory nodes; Engram doesn't address hardware topology
• Neuromorphic mapping – brain regions mapped onto NUMA nodes
• Tetranary confidence – 4-state routing logic
• Vec_perm collapse – single-cycle attention on POWER8
• PowerLISP – LLMs that actually remember
• L2/L3 prefetch – 147 t/s vs 17 t/s stock (8.6x)
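Of the additions above, the tetranary confidence idea lends itself to a short sketch. The four state names, thresholds, and margin rule below are my own guesses at what a 4-state router could look like, not the author's implementation.

```python
# Hypothetical sketch of 4-state ("tetranary") routing confidence.
# States, thresholds, and the margin heuristic are illustrative only.
from enum import Enum

class Confidence(Enum):
    HIT = 0        # high confidence: route straight to the cached bank
    PROBABLE = 1   # medium: route, but prefetch the runner-up bank
    AMBIGUOUS = 2  # low: consult both candidate banks
    MISS = 3       # no signal: fall back to a general bank

def tetranary(score: float, runner_up: float) -> Confidence:
    """Map a classifier score (and its runner-up) to one of four routes."""
    margin = score - runner_up
    if score >= 0.8 and margin >= 0.3:
        return Confidence.HIT
    if score >= 0.5:
        return Confidence.PROBABLE
    if score >= 0.2:
        return Confidence.AMBIGUOUS
    return Confidence.MISS
```

The point of four states rather than a binary hit/miss is that the middle two states can trigger prefetch or dual-bank lookup instead of forcing a hard routing decision.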
DOIs:
• RAM Coffers (Dec 16, 2025): doi.org/10.6084/m9.figshare.31093429
• Neuromorphic: doi.org/10.5281/zenodo.18321905
• PowerLISP: doi.org/10.5281/zenodo.18322052
GitHub: github.com/Scottcjn/ram-coffers