Show HN: I ran a language model on a PS2

https://github.com/xaskasdf/ps2-llm
22•xaskasdf•3d ago
The Emotion Engine has 32 MB of RAM total, so the trick is streaming weights from CD-ROM one matrix at a time during the forward pass — only activations, KV cache and embeddings live in RAM. This means models bigger than the RAM can still run, they just read more from disc.
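A minimal sketch of the streaming idea described above, not the project's actual code: weight matrices are read from a file-like object one at a time, applied to the activation vector, and then discarded, so peak memory is one matrix plus the activations. The on-disc layout (a `rows, cols` header followed by little-endian float32 values), the function names, and the use of a plain chain of matrix-vector products in place of a real transformer layer are all assumptions made for illustration. An in-memory `io.BytesIO` stands in for the disc.

```python
import io
import struct

def write_matrices(f, matrices):
    """Serialize each matrix as rows, cols (uint32 LE) then row-major float32.
    This layout is hypothetical, chosen only to keep the sketch short."""
    for m in matrices:
        rows, cols = len(m), len(m[0])
        f.write(struct.pack("<II", rows, cols))
        for row in m:
            f.write(struct.pack(f"<{cols}f", *row))

def stream_matrices(f):
    """Yield one matrix at a time; only the current matrix is held in memory."""
    while True:
        header = f.read(8)
        if len(header) < 8:
            return  # end of stream
        rows, cols = struct.unpack("<II", header)
        yield [list(struct.unpack(f"<{cols}f", f.read(4 * cols)))
               for _ in range(rows)]

def matvec(m, x):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in m]

def forward(f, x):
    """Apply each streamed matrix to the activations, then let it go."""
    for m in stream_matrices(f):
        x = matvec(m, x)
    return x

# Stand-in for the disc: two small matrices written to an in-memory buffer.
disc = io.BytesIO()
write_matrices(disc, [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5]]])
disc.seek(0)

result = forward(disc, [1.0, 1.0])
print(result)  # [5.0]
```

Because the generator never holds more than one matrix, the total size of the weight file does not affect peak memory, only how much has to be read per forward pass.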

Had to build a custom quantized format (PSNT), hack endianness, write a tokenizer pipeline, and most of the PS2 SDK from scratch (releasing that separately). The model itself is also custom — a 10M param Llama-style architecture I trained specifically for this.

And it works. On real hardware.

Comments

SilentEditor•3d ago
Love this project. The CD streaming trick is such a smart constraint hack, and honestly the best part is you trained the model for the hardware instead of forcing a desktop recipe onto PS2.

Curious about 2 things if you can share:

1. What's your per-token latency on real hardware?
2. How much quality loss came from PSNT quantization vs an fp16 baseline?

Either way, this is peak hacker energy; shipping on actual hardware makes it 10x cooler.

xaskasdf•3d ago
It didn't have any quality loss, since PSNT as a quantization format is mainly there to fit the model to the console's constraints (you can convert any model you want, even though I trained a model for this hardware); it's q8 quantization, so quality loss is negligible at these sizes. As for speed, I will fix the tok/sec counter, since right now it always shows 0.

PS: Thank you! And I forgot to mention that PSNT also supports bitnet models; they work like crap, though.
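To illustrate why q8 quantization costs so little precision, here is a generic sketch of symmetric 8-bit quantization. The thread does not describe PSNT's actual layout (for example, whether scales are per tensor or per block), so the per-tensor scale and the function names below are assumptions, not the project's format.

```python
def quantize_q8(weights):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q,
    with q an integer in [-127, 127]. Hypothetical scheme for illustration."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_q8(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.0, 0.49]
q, scale = quantize_q8(weights)
restored = dequantize_q8(q, scale)

# Rounding to the nearest code bounds the error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

With this scheme each weight is off by at most half a step, `scale / 2`, which is under 0.4% of the largest weight in the tensor.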

SilentEditor•1d ago
That's super helpful, thanks for the details. Makes sense now that PSNT is more of a transport/runtime format for the PS2 constraints than a quality hack.

Very cool that it supports bitnet too, even if results are rough right now; feels like there's a lot of room to tune there over time. When you do fix tok/sec, are you planning to post per-stage timings too (tokenizer, weight stream, matmul, sampling)? Would be awesome to see where the biggest bottleneck is on real hw.

mememememememo•2d ago
How many tok/hr?

SachitRafa•2d ago
The CD-ROM streaming approach is the real insight here: keeping only activations and the KV cache in RAM and streaming weights one matrix at a time sidesteps the 32 MB constraint entirely. It's essentially the same trick modern edge inference does with flash storage, just on hardware from 2000.

Curious about the latency profile. With CD-ROM read speeds around 1.6 MB/s on PS2, the 77 MB SmolLM2 model being too slow makes sense, but how does the 10 MB brandon-tiny feel in practice? Are you getting tokens per minute, or more like a token every several seconds?

Also interested in the custom PSNT format decision: was the main motivation the PS2's MIPS alignment constraints, or was there something about the existing GGUF/llama.c formats that made them impractical to parse on the Emotion Engine?

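A back-of-envelope version of the latency question, using only the figures quoted in this comment (1.6 MB/s, 10 MB, 77 MB), none of which are measurements. It assumes the whole model file is read from disc once per token and ignores seek time and compute; since the post says embeddings stay in RAM, the real streamed volume is presumably somewhat smaller.

```python
def seconds_per_token(streamed_mb, read_mb_per_s):
    """Rough transfer time per token if streamed_mb is read once per token."""
    return streamed_mb / read_mb_per_s

READ_SPEED = 1.6  # MB/s, as quoted in the comment above (assumed, not measured)

for name, size_mb in [("brandon-tiny", 10), ("SmolLM2", 77)]:
    t = seconds_per_token(size_mb, READ_SPEED)
    print(f"{name}: about {t:.0f} s/token, about {60 / t:.1f} tokens/min")
```

Under these assumptions the 10 MB model lands around 6 seconds per token and the 77 MB model around 48 seconds per token, so disc transfer alone would put both in "tokens per minute" territory.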
randkyp•1h ago
Neat! While the physicality of having the CD spin while running inference is undeniably cool, I wonder if you could run larger models at higher speeds through the PS2 HDD accessory/Memory Card Micro SD adapter/the PS2's USB port.

I doubt the VUs can help with inference given their small scratchpad sizes and instruction set though, haha.

pooparse•55m ago
IIRC the EE had some interesting hardware with vector units. Were these of any use/benefit here?
