frontpage.

Towards Memory Specialization: A Case for Long-Term and Short-Term RAM

https://arxiv.org/abs/2508.02992
19•PaulHoule•1h ago

Comments

Animats•58m ago
What they seem to want is fast-read, slow-write memory. "Primary applications include model weights in ML inference, code pages, hot instruction paths, and relatively static data pages". Is there device physics for cheaper, smaller fast-read, slow-write memory cells for that?

For "hot instruction paths", caching is already the answer. Not sure about locality of reference for model weights. Do LLMs blow the cache?

toast0•36m ago
Probably not what they want, but NOR flash is generally directly addressable; it's commonly used to replace mask ROMs.
bobmcnamara•28m ago
NOR is usually limited to <30 MHz, but if you always want to fetch an entire cacheline and design the read port accordingly, you can pull in the whole line at once, which is pretty neat.

I don't know if anyone has applied this to neural networks.
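
To put rough numbers on the line-at-a-time NOR read idea, here is a back-of-the-envelope sketch in C; the 64-byte line size and the 512-bit read port width are illustrative assumptions, not figures from the thread.

    /* Rough throughput of a NOR array clocked near the ~30 MHz figure above,
     * assuming the read port returns one full 64-byte cacheline per access. */
    #include <stdio.h>

    int main(void)
    {
        const double clock_hz      = 30e6;  /* "<30 MHz" limit from the comment */
        const double cacheline_b   = 64.0;  /* common cacheline size (assumed)  */
        const double bytes_per_sec = clock_hz * cacheline_b;

        printf("~%.1f GB/s per 512-bit-wide read port\n", bytes_per_sec / 1e9);
        return 0;
    }

Under those assumptions that is on the order of 1.9 GB/s of read bandwidth per port, which is why a line-wide read port makes the low clock rate less painful.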

bobmcnamara•33m ago
> Do LLMs blow the cache?

Sometimes very yes?

If you've got 1GB of weights, those are coming through the caches on their way to the execution units somehow.

Many caches are smart enough to recognize these accesses as strided, streaming, heavily prefetchable, evictable reads and optimize for that.

Many models are now quantized too, which reduces the overall memory bandwidth needed for execution and also helps with caching.
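
As a minimal sketch of what such a streaming, low-reuse weight read can look like from the software side, assuming a GCC/Clang compiler (the __builtin_prefetch hint and the prefetch distance are illustrative choices, not anything from the thread):

    /* Minimal sketch: streaming a large, read-only weight array with a
     * low-temporal-locality prefetch hint, as a toy stand-in for the
     * strided, prefetchable, evictable reads described above. */
    #include <stddef.h>

    #define PF_DIST 64  /* elements to prefetch ahead; illustrative, tune per platform */

    float dot_streaming(const float *weights, const float *activations, size_t n)
    {
        float acc = 0.0f;
        for (size_t i = 0; i < n; i++) {
            /* rw = 0 (read), locality = 0 (non-temporal): hint that this
             * line will not be reused soon, so the cache can evict it early. */
            if (i + PF_DIST < n)
                __builtin_prefetch(&weights[i + PF_DIST], 0, 0);
            acc += weights[i] * activations[i];
        }
        return acc;
    }

In practice the hardware prefetchers usually spot this pattern on their own, which is the point of the comment; the explicit hint just makes the streaming intent visible in the code.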

photochemsyn•13m ago
Yes, this from the paper:

> "The key insight motivating LtRAM is that long data lifetimes and read heavy access patterns allow optimizations that are unsuitable for general purpose memories. Primary applications include model weights in ML inference, code pages, hot instruction paths, and relatively static data pages—workloads that can tolerate higher write costs in exchange for lower read energy and improved cost per bit. This specialization addresses fundamental mismatches in current systems where read intensive data competes for the same resources as frequently modified data."

Essentially, I guess they're calling for more task-specific hardware for LLM inference, much like what was done with networking equipment for dedicated packet processing, with specialized SRAM/DRAM/TCAM tiers to keep latency to a minimum.

While there's an obvious need for that in handling traffic flow across the internet, the practical question is whether LLMs will really scale the same way or whether there's a massive AI/LLM bubble about to pop. Who knows? The tea leaves are unclear.

Grosvenor•24m ago
I'll put the Tandem five-minute rule paper here; it seems very relevant.

https://dsf.berkeley.edu/cs286/papers/fiveminute-tr1986.pdf

And a revisit of the rule 20 years later (it still held).

https://cs-people.bu.edu/mathan/reading-groups/papers-classi...

staindk•14m ago
Sounds a bit like Intel's Optane, which seemed great in principle but I never had a use for it.

https://www.intel.com/content/www/us/en/products/details/mem...

https://en.wikipedia.org/wiki/3D_XPoint

dooglius•13m ago
I'm not seeing the case for adding this to general-purpose CPUs/software. Only a small portion of software is going to be properly annotated to take advantage of this, so it'd be a pointless cost for the rest of users. Normally short-term access can easily become long-term in the tail if the process gets preempted by something higher priority or spends a lot of time on an I/O operation. It's also not clear why, if you had an efficient solution for the short-term case, you wouldn't just add a refresh cycle and use it in place of normal SRAM as a generic cache. These make a lot more sense in a dedicated hardware context -- like neural nets -- which I think is the authors' main target here.
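
For a feel of what that annotation burden could look like, here is a purely hypothetical sketch; mem_lifetime_hint and alloc_with_hint are invented names, not an API from the paper or any real OS.

    /* Hypothetical sketch only: annotating allocations as long-term
     * (read-mostly) vs. short-term so a runtime could steer them toward
     * LtRAM or StRAM. The names below are invented for illustration. */
    #include <stdlib.h>

    typedef enum {
        MEM_LONG_TERM,   /* read-heavy, rarely written: weights, code pages */
        MEM_SHORT_TERM   /* frequently rewritten: activations, scratch space */
    } mem_lifetime_hint;

    static void *alloc_with_hint(size_t size, mem_lifetime_hint hint)
    {
        (void)hint;      /* no specialized hardware: both hints fall back to malloc */
        return malloc(size);
    }

Even as a no-op fallback, every call site still has to decide which bucket its data falls into, which is the annotation cost being questioned above.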

Do conversations end when people want them to?

https://www.experimental-history.com/p/do-conversations-end-when-people
1•ntnbr•2m ago•0 comments

Rented Robots Get the Worst Jobs and Help Factories Keep the Humans

https://www.nytimes.com/2025/08/25/business/factories-robot-rentals.html
1•bookofjoe•5m ago•1 comments

Show HN: IndiePubStack – Open-source self-hosting Substack alternative for devs

https://github.com/IndiePubStack/IndiePubStack
1•andfadeev•7m ago•0 comments

ShaderGlass lets you run GPU shaders over any window

https://github.com/mausimus/ShaderGlass
2•gman83•9m ago•0 comments

Show HN: Codefmt – a fast Markdown code block formatter

https://github.com/1nwf/codefmt
1•1nwf•9m ago•0 comments

Ultima Underworld Retrospective – Forging a New Era [video]

https://www.youtube.com/watch?v=KegyZSGhVMg
2•CharlesW•9m ago•0 comments

Show HN: TypeScript boilerplate for scaling Claude Code beyond context limits

https://github.com/shinpr/ai-coding-project-boilerplate
1•shinpr•10m ago•0 comments

Show HN: I made a clothes-fitter with nano-banana

https://fitcheck.c3n.ro/
1•CodinM•16m ago•0 comments

extipy – Debug your Python script with a Jupyter notebook

https://github.com/ebanner/extipy
1•meken•21m ago•1 comments

What the 'Panama Playlists' Exposed About Spotify User Privacy

https://www.nytimes.com/2025/08/24/technology/spotify-panama-playlists-privacy.html
1•reaperducer•23m ago•1 comments

States target wealthy homeowners with new property taxes, sparking backlash

https://seekingalpha.com/news/4490703-states-target-wealthy-homeowners-with-new-property-taxes-sp...
8•donsupreme•25m ago•0 comments

Device Can Read the Pages of a Book Without Opening It

https://www.wbur.org/news/2016/09/13/read-a-book-without-opening-it
1•andrewstuart•28m ago•0 comments

Big Tech Power Rankings – September 1, 2025

https://www.powerrankings.tech/
2•meshugaas•28m ago•0 comments

Space investing goes mainstream as VCs ditch the rocket science requirements

https://techcrunch.com/2025/09/01/space-investing-goes-mainstream-as-vcs-ditch-the-rocket-science...
1•DocFeind•28m ago•0 comments

All you need is Make

https://github.com/avkcode/SRE/tree/main/vault
2•GitPopTarts•30m ago•0 comments

EU

1•klknazik•30m ago•1 comments

Show HN: Restaurant AI Host, Always On

https://heep.ai/
2•mathis-ve•31m ago•0 comments

Nimony: Design Principles

https://nim-lang.org/araq/nimony.html
1•Bogdanp•31m ago•0 comments

How OnlyFans Piracy Is Ruining the Internet for Everyone

https://www.404media.co/how-onlyfans-piracy-is-ruining-the-internet-for-everyone/
5•mikhael•31m ago•0 comments

A day in the life of a vibe coder

https://medium.com/@BryMei/a-day-in-the-life-of-a-vibe-coder-a335fbb7033b
1•bry_guy•33m ago•0 comments

Ask HN: Learning Code Is Dead?

2•demirbey05•35m ago•3 comments

Show HN: TurboTable – Your AI Knowledge Worker

https://turbotable.ai
1•murshudoff•37m ago•0 comments

Dragon Bravo megafire shows the growing wildfire threat to water systems

https://theconversation.com/grand-canyons-dragon-bravo-megafire-shows-the-growing-wildfire-threat...
3•PaulHoule•41m ago•0 comments

How to Exit Vim

https://github.com/hakluke/how-to-exit-vim
1•mirawelner•42m ago•1 comments

Phyphox: Physical Phone Experiments

https://phyphox.org/
1•thunderbong•45m ago•0 comments

Raspberry Pi 5 support (OpenBSD)

https://marc.info/?l=openbsd-cvs&m=175675287220070&w=2
10•brynet•45m ago•1 comments

Massimo

https://blog.platformatic.dev/introducing-massimo
1•feross•46m ago•0 comments

NativePHP – Build Native PHP Apps

https://nativephp.com
1•TheFreim•47m ago•0 comments

Functional Source License (FSL)

https://fsl.software/
2•jotaen•48m ago•0 comments

KPH Crypto Enigma CW Message 2025 [video]

https://www.youtube.com/watch?v=DJQZffcHbf8
1•austinallegro•57m ago•0 comments