frontpage.

Agent News Chat – AI agents talk to each other about the news

https://www.agentnewschat.com/
1•kiddz•31s ago•0 comments

Do you have a mathematically attractive face?

https://www.doimog.com
1•a_n•4m ago•1 comment

Code only says what it does

https://brooker.co.za/blog/2020/06/23/code.html
1•logicprog•10m ago•0 comments

The success of 'natural language programming'

https://brooker.co.za/blog/2025/12/16/natural-language.html
1•logicprog•10m ago•0 comments

The Scriptovision Super Micro Script video titler is almost a home computer

http://oldvcr.blogspot.com/2026/02/the-scriptovision-super-micro-script.html
3•todsacerdoti•10m ago•0 comments

Discovering the "original" iPhone from 1995 [video]

https://www.youtube.com/watch?v=7cip9w-UxIc
1•fortran77•12m ago•0 comments

Psychometric Comparability of LLM-Based Digital Twins

https://arxiv.org/abs/2601.14264
1•PaulHoule•13m ago•0 comments

SidePop – track revenue, costs, and overall business health in one place

https://www.sidepop.io
1•ecaglar•16m ago•1 comment

The Other Markov's Inequality

https://www.ethanepperly.com/index.php/2026/01/16/the-other-markovs-inequality/
1•tzury•17m ago•0 comments

The Cascading Effects of Repackaged APIs [pdf]

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6055034
1•Tejas_dmg•19m ago•0 comments

Lightweight and extensible compatibility layer between dataframe libraries

https://narwhals-dev.github.io/narwhals/
1•kermatt•22m ago•0 comments

Haskell for all: Beyond agentic coding

https://haskellforall.com/2026/02/beyond-agentic-coding
2•RebelPotato•25m ago•0 comments

Dorsey's Block cutting up to 10% of staff

https://www.reuters.com/business/dorseys-block-cutting-up-10-staff-bloomberg-news-reports-2026-02...
2•dev_tty01•28m ago•0 comments

Show HN: Freenet Lives – Real-Time Decentralized Apps at Scale [video]

https://www.youtube.com/watch?v=3SxNBz1VTE0
1•sanity•30m ago•1 comment

In the AI age, 'slow and steady' doesn't win

https://www.semafor.com/article/01/30/2026/in-the-ai-age-slow-and-steady-is-on-the-outs
1•mooreds•37m ago•1 comment

Administration won't let student deported to Honduras return

https://www.reuters.com/world/us/trump-administration-wont-let-student-deported-honduras-return-2...
1•petethomas•37m ago•0 comments

How were the NIST ECDSA curve parameters generated? (2023)

https://saweis.net/posts/nist-curve-seed-origins.html
2•mooreds•38m ago•0 comments

AI, networks and Mechanical Turks (2025)

https://www.ben-evans.com/benedictevans/2025/11/23/ai-networks-and-mechanical-turks
1•mooreds•38m ago•0 comments

Goto Considered Awesome [video]

https://www.youtube.com/watch?v=1UKVEUGEk6Y
1•linkdd•41m ago•0 comments

Show HN: I Built a Free AI LinkedIn Carousel Generator

https://carousel-ai.intellisell.ai/
1•troyethaniel•42m ago•0 comments

Implementing Auto Tiling with Just 5 Tiles

https://www.kyledunbar.dev/2026/02/05/Implementing-auto-tiling-with-just-5-tiles.html
1•todsacerdoti•43m ago•0 comments

Open Challenge (Get all universities involved)

https://x.com/i/grok/share/3513b9001b8445e49e4795c93bcb1855
1•rwilliamspbgops•44m ago•0 comments

Apple Tried to Tamper-Proof AirTag 2 Speakers – I Broke It [video]

https://www.youtube.com/watch?v=QLK6ixQpQsQ
2•gnabgib•46m ago•0 comments

Show HN: Isolating AI-generated code from human code | Vibe as a Code

https://www.npmjs.com/package/@gace/vaac
1•bstrama•47m ago•0 comments

Show HN: More beautiful and usable Hacker News

https://twitter.com/shivamhwp/status/2020125417995436090
3•shivamhwp•48m ago•0 comments

Toledo Derailment Rescue [video]

https://www.youtube.com/watch?v=wPHh5yHxkfU
1•samsolomon•50m ago•0 comments

War Department Cuts Ties with Harvard University

https://www.war.gov/News/News-Stories/Article/Article/4399812/war-department-cuts-ties-with-harva...
9•geox•53m ago•1 comment

Show HN: LocalGPT – A local-first AI assistant in Rust with persistent memory

https://github.com/localgpt-app/localgpt
5•yi_wang•54m ago•0 comments

A Bid-Based NFT Advertising Grid

https://bidsabillion.com/
1•chainbuilder•58m ago•1 comment

AI readability score for your documentation

https://docsalot.dev/tools/docsagent-score
1•fazkan•1h ago•0 comments

Challenges and Research Directions for Large Language Model Inference Hardware

https://arxiv.org/abs/2601.05047
123•transpute•1w ago

Comments

jauntywundrkind•1w ago
> To address these challenges, we highlight four architecture research opportunities: High Bandwidth Flash for 10X memory capacity with HBM-like bandwidth; Processing-Near-Memory and 3D memory-logic stacking for high memory bandwidth; and low-latency interconnect to speedup communication.

High Bandwidth Flash (HBF) got submitted 6 hours ago! It's a great article, fantastic coverage of a wide section of the rapidly moving industry. https://news.ycombinator.com/item?id=46700384 https://blocksandfiles.com/2026/01/19/a-window-into-hbf-prog...

HBF is about having many dozens or hundreds of channels of flash memory. The idea of Processing-Near-Memory spread across HBF, perhaps in a mixed 3D design, would not surprise me at all. One of the main challenges for HBF is building improved vias and improved stacking, and if that tech advances, the idea of mixed NAND-and-compute layers rather than pure NAND stacks opens up too.

These are all really exciting possible next steps.

amelius•1w ago
Why is persistence such a big thing here? Non-flash memory just needs a tiny bit of power to keep its data. I don't see the revolutionary use case.
Gracana•1w ago
Density is the key here, not persistence.
amelius•1w ago
Thanks! This explains it.

Now I'm wondering how you deal with the limited number of write cycles of Flash memory. Or maybe that is not an issue in some applications?

mrob•1w ago
During inference, most of the memory is read only.
amelius•1w ago
Sounds fair. That's not the kind of machine I'd want as a development system, though. And usually development systems are beefier than production systems, so I'm curious how they'd solve that.
Gracana•1w ago
Yeah, it is quite specialized for inference. It's unlikely that you'd see this stuff outside of hardware specifically for that.

Development systems for AI inference tend to be smaller by necessity. A DGX Spark, a DGX Station, a single B300 node... you'd work on something like that before deploying to a larger cluster. You're just not going to have a dev box bigger than what you actually deploy to.

transpute•1w ago
HBF, like expensive HBM, is targeted at AI data centers.

  The KAIST professor discussed an HBF unit having a capacity of 512 GB and a 1.638 TBps bandwidth.
PCIe x8 GPU bandwidth is about 32 GB/s, so HBF could be roughly 50x PCIe bandwidth.
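
A quick back-of-the-envelope check of that ratio (a sketch in Python; the 1.638 TBps figure is from the quote above, and the ~32 GB/s figure assumes a PCIe Gen5 x8 link):

  # Back-of-the-envelope: HBF bandwidth vs. a PCIe x8 link
  hbf_gb_per_s = 1638.0     # 1.638 TB/s, quoted for the KAIST HBF unit
  pcie_gb_per_s = 32.0      # ~32 GB/s, assumed PCIe Gen5 x8
  print(hbf_gb_per_s / pcie_gb_per_s)  # => ~51.2, i.e. roughly 50x
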
bluehat974•1w ago
Related too https://www.sdxcentral.com/news/ai-inference-crisis-google-e...
random_duck•1w ago
Yup, reads like the executive summary (in a good way).
random3•1w ago
David Patterson is such a legend! From RAID to RISC to one of the best books in computer architecture, he's in my personal hall of fame.

Several years ago I was at one of the Berkeley AMP Lab retreats at Asilomar, and as I was hanging out, I couldn't figure out how I knew the person in front of me, until an hour later when I saw his name during a panel :)).

It was always the network. And David Patterson, after RISC, started working on iRAM, which tackled a related problem.

NVIDIA bought Mellanox/InfiniBand, but Google has historically excelled at networking, and the TPU seems to be designed to scale out in the best possible way.

suggeststrongid•1w ago
Can’t we credit the first author in the title too? Come on.
random_duck•1w ago
No we can't, that would be a crime against royalty :)
transpute•1w ago
The current title uses 79 characters of an 80-character budget:

  75% = title written by first author
  22% = name of second author, endorsing work of first author
HN mods can revert the title to the original headline, without any author.
amelius•1w ago
That appendix of memory prices looks interesting, but it misses the recent trend.
zozbot234•1w ago
Weird to see no mention in this paper of persistent memory technologies beyond NAND flash. Some of them, like ReRAM, also enable compute-in-memory which the authors regard as quite important.
HPsquared•1w ago
Why not, instead of passing the entire model through a processor and running it on every bit of data, pass the data (which is much smaller) through the model? As in, have compute and memory together in the silicon. Then you only need to shuffle the data itself around (perhaps by broadcast) rather than the entire model. That seems like it would use a LOT less energy.

Or is it not possible to make the algorithms parallel to this degree?

Edit: apparently this is called "compute-in-memory"
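
To make the data-movement intuition concrete, here's a toy comparison (illustrative Python with made-up model dimensions, not figures from the paper): per token, streaming the weights past the compute moves model-sized traffic, while compute sitting with the memory only moves activation-sized traffic between layers.

  # Toy sketch: bytes moved per token, weight-streaming vs. compute-in-memory.
  # All numbers are hypothetical, chosen to resemble a large dense model.
  params = 70e9            # 70B parameters (hypothetical)
  bytes_per_param = 2      # fp16
  layers = 80              # hypothetical layer count
  hidden = 8192            # hypothetical hidden dimension

  weight_traffic = params * bytes_per_param        # stream weights to compute
  act_traffic = layers * hidden * bytes_per_param  # stream activations between layers

  print(f"weights: {weight_traffic / 1e9:.0f} GB per token")   # ~140 GB
  print(f"activations: {act_traffic / 1e6:.1f} MB per token")  # ~1.3 MB
  # ~5 orders of magnitude less traffic when compute stays with the memory.
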

pavpanchekha•1w ago
Frontier models are now much bigger than an individual query, hence batching, MoE, etc. So this idea, while very plausible, has economic constraints: you'd need vast amounts of memory.
jmalicki•1w ago
This is already done that way at the GPU layer of abstraction - generally (with some exceptions!) the model lives in GPU VRAM, and you stream the data batch by batch through the model.

The problem is that for larger models the model barely fits in VRAM, so it definitely doesn't fit in cache.

Dataflow processors like Cerebras do stream the data through the model (for smaller models at least, or when they can hold smaller portions of a model) - each little core has local memory, and you move the data to where it needs to go. To achieve this, though, Cerebras has 96GB of what is basically L1 cache among its cores, which is... a lot of SRAM.
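
A minimal sketch of that weight-stationary pattern (assuming PyTorch and a CUDA device; the shapes are arbitrary):

  import torch

  # Weights are loaded into GPU memory once and stay resident.
  model = torch.nn.Linear(8192, 8192).cuda()

  # Data streams through the resident model batch by batch; only the
  # (much smaller) activations cross the host-device bus.
  for _ in range(100):
      batch = torch.randn(32, 8192).cuda()  # host -> device, per batch
      out = model(batch)                    # weights never leave the device
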

westurner•1w ago
In-memory processing: https://en.wikipedia.org/wiki/In-memory_processing

Computational RAM: https://en.wikipedia.org/wiki/Computational_RAM

westurner•1w ago
While designing a concept sustainable RAM product and working around multiplexing scaling challenges, I somewhat accidentally developed a potential solution for hosting already-trained LLMs with very low energy, on hardware made of carbon and lignin:

> You have effectively designed a Diffractive Deep Neural Network (D^2NN) that doubles as a storage device.

Mode Division Multiplexing (MDM) via OAM solitons, potentially with gratings designed through inverse design of a transition map, to be lasered, possibly with a galvo laser. This would be a very low-power way to run LLMs on a lasered substrate.

fulafel•1w ago
Yes, this is the #2 direction recommended by the paper. Do you have arguments re "Table 4 lists why PNM is better than PIM for LLM inference, despite weaknesses in bandwidth and power"?
HPsquared•1w ago
There are advantages; I suppose it comes down to economics and which of the advantages and disadvantages are greater. If PIM were ever to catch on, it would probably start off in mobile devices, where energy efficiency is a high priority. It still might be impractical, though.