frontpage.

EVs Are a Failed Experiment

https://spectator.org/evs-are-a-failed-experiment/
1•ArtemZ•6m ago•1 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•7m ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
1•LiamPowell•9m ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
2•duxup•11m ago•0 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•13m ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•25m ago•1 comments

Deeper into the sharing of one air conditioner for 2 rooms

1•ozzysnaps•27m ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
2•savrajsingh•28m ago•0 comments

Why Embedded Models Must Hallucinate: A Boundary Theory (RCC)

http://www.effacermonexistence.com/rcc-hn-1-1
1•formerOpenAI•29m ago•2 comments

A Curated List of ML System Design Case Studies

https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
3•tejonutella•33m ago•0 comments

Pony Alpha: New free 200K context model for coding, reasoning and roleplay

https://ponyalpha.pro
1•qzcanoe•38m ago•1 comments

Show HN: Tunbot – Discord bot for temporary Cloudflare tunnels behind CGNAT

https://github.com/Goofygiraffe06/tunbot
1•g1raffe•40m ago•0 comments

Open Problems in Mechanistic Interpretability

https://arxiv.org/abs/2501.16496
2•vinhnx•46m ago•0 comments

Bye Bye Humanity: The Potential AMOC Collapse

https://thatjoescott.com/2026/02/03/bye-bye-humanity-the-potential-amoc-collapse/
2•rolph•50m ago•0 comments

Dexter: Claude-Code-Style Agent for Financial Statements and Valuation

https://github.com/virattt/dexter
1•Lwrless•52m ago•0 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
1•vermilingua•57m ago•0 comments

Essential CDN: The CDN that lets you do more than JavaScript

https://essentialcdn.fluidity.workers.dev/
1•telui•58m ago•1 comments

They Hijacked Our Tech [video]

https://www.youtube.com/watch?v=-nJM5HvnT5k
1•cedel2k1•1h ago•0 comments

Vouch

https://twitter.com/mitchellh/status/2020252149117313349
34•chwtutha•1h ago•5 comments

HRL Labs in Malibu laying off 1/3 of their workforce

https://www.dailynews.com/2026/02/06/hrl-labs-cuts-376-jobs-in-malibu-after-losing-government-work/
4•osnium123•1h ago•1 comments

Show HN: High-performance bidirectional list for React, React Native, and Vue

https://suhaotian.github.io/broad-infinite-list/
2•jeremy_su•1h ago•0 comments

Show HN: I built a Mac screen recorder Recap.Studio

https://recap.studio/
1•fx31xo•1h ago•1 comments

Ask HN: Codex 5.3 broke toolcalls? Opus 4.6 ignores instructions?

1•kachapopopow•1h ago•0 comments

Vectors and HNSW for Dummies

https://anvitra.ai/blog/vectors-and-hnsw/
1•melvinodsa•1h ago•0 comments

Sanskrit AI beats CleanRL SOTA by 125%

https://huggingface.co/ParamTatva/sanskrit-ppo-hopper-v5/blob/main/docs/blog.md
1•prabhatkr•1h ago•1 comments

'Washington Post' CEO resigns after going AWOL during job cuts

https://www.npr.org/2026/02/07/nx-s1-5705413/washington-post-ceo-resigns-will-lewis
4•thread_id•1h ago•1 comments

Claude Opus 4.6 Fast Mode: 2.5× faster, ~6× more expensive

https://twitter.com/claudeai/status/2020207322124132504
1•geeknews•1h ago•0 comments

TSMC to produce 3-nanometer chips in Japan

https://www3.nhk.or.jp/nhkworld/en/news/20260205_B4/
3•cwwc•1h ago•0 comments

Quantization-Aware Distillation

http://ternarysearch.blogspot.com/2026/02/quantization-aware-distillation.html
2•paladin314159•1h ago•0 comments

List of Musical Genres

https://en.wikipedia.org/wiki/List_of_music_genres_and_styles
1•omosubi•1h ago•0 comments

Nvidia DGX Spark and Apple Mac Studio = 4x Faster LLM Inference with EXO 1.0

https://blog.exolabs.net/nvidia-dgx-spark/
61•edelsohn•3mo ago

Comments

pram•3mo ago
Very cool, using the DGX like an “AI eGPU.” I wonder if this could also benefit stuff like Stable Diffusion/WAN etc?
alexandercheema•3mo ago
Yes, these models are mostly compute-bound, so they benefit even more from the compute on the DGX Spark.
dekhn•3mo ago
Are you using USB-C for networking between the Spark and the Mac?
pdpi•3mo ago
IP over Thunderbolt is definitely a thing; I don't know whether IP over USB is as well. USB4x2 or TB5 can do 80 Gbit/s symmetrical or 120+40 asymmetrical (and boy is this a poster child for the asymmetrical setup). The Mac definitely supports that fine, so as long as the Spark plays nice, USB is actually a legitimately decent choice.
esseph•3mo ago
USB4 was based on Thunderbolt 3.

Yes, it's a thing that works.
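
If the Thunderbolt/USB4 bridge shows up as an ordinary IP interface on both machines (which is how such bridges typically appear), a rough throughput sanity check needs nothing beyond Python's standard library. The sketch below is illustrative only: the port and the ~8 GiB transfer size are arbitrary choices, and a dedicated tool like iperf3 will give more rigorous numbers.

    # Rough point-to-point throughput check between two hosts, e.g. a Mac and a
    # DGX Spark bridged over Thunderbolt/USB4 networking. Run with no arguments
    # on the receiving machine, and with the receiver's bridge IP on the sender.
    import socket, sys, time

    CHUNK = 4 * 1024 * 1024   # 4 MiB per send/recv call
    TOTAL = 8 * 1024 ** 3     # push ~8 GiB for a stable estimate

    def serve(host="0.0.0.0", port=5001):
        with socket.create_server((host, port)) as srv:
            conn, addr = srv.accept()
            with conn:
                received, start = 0, time.perf_counter()
                while True:
                    buf = conn.recv(CHUNK)
                    if not buf:
                        break
                    received += len(buf)
                secs = time.perf_counter() - start
                print(f"{received / secs / 1e9 * 8:.1f} Gbit/s from {addr}")

    def send(host, port=5001):
        payload = bytes(CHUNK)
        with socket.create_connection((host, port)) as conn:
            sent, start = 0, time.perf_counter()
            while sent < TOTAL:
                conn.sendall(payload)
                sent += CHUNK
        print(f"sent {sent / 1e9:.1f} GB in {time.perf_counter() - start:.1f}s")

    if __name__ == "__main__":
        serve() if len(sys.argv) == 1 else send(sys.argv[1])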

mehdibl•3mo ago
The gain is only in prefill, and if the task/output is complex the gain will be minor. The numbers here are quite exaggerated, being based on a prompt that takes less than 2s to decode, so I guess we're not talking about complex tasks with hundreds or thousands of output tokens. For the cost of an M3 Ultra + DGX the gain seems minimal, and most of all, exo didn't clarify the model used here. It's surely not a dense model or an MoE with 1B or 2B experts, otherwise the Mac Ultra would suffer a lot too and the layers would be bigger!
solarkraft•3mo ago
Anecdotally, even medium-sized prompts (a few thousand tokens) on pretty small models (2-8B) have resulted in extremely noticeable slowdowns (the vast majority of total processing time) on my M1 Mac, leading me to appreciate the significance of the prefill step (and the difficulty of processing large contexts locally).
adam_arthur•3mo ago
I'm confused by all the takes implying decode is more important than prefill.

There are an enormous number of use cases where the prompt is large and the expected output is small.

E.g. providing data for the LLM to analyze, after which it gives a simple yes/no Boolean response. Or selecting a single enum value from a set.

This pattern seems far more valuable in practice than the common and lazy open-ended chat-style implementations (lazy from a product perspective).

Obviously decode will be important for code generation or search, but that's such a small set of possible applications, and you'll probably always do better being on the latest models in the cloud.
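
To make the prefill-vs-decode split concrete, here is a minimal sketch that times a large-prompt, tiny-output request against an OpenAI-compatible local server; the base_url, api_key, and model name are placeholders for whatever you run locally, and time-to-first-token is used as a rough proxy for prefill time.

    # Split time-to-first-token (~prefill) from decode time for a request with a
    # long prompt and a classification-style short answer.
    import time
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")  # placeholder endpoint

    prompt = ("Here is a long document to analyze...\n" * 500
              + "\nAnswer with a single word, yes or no: is the document long?")

    start = time.perf_counter()
    first_token_at = None
    stream = client.chat.completions.create(
        model="local-model",                               # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        max_tokens=4,                                      # tiny output, big prompt
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content and first_token_at is None:
            first_token_at = time.perf_counter()
    end = time.perf_counter()

    if first_token_at is not None:
        print(f"prefill (time to first token): {first_token_at - start:.2f}s")
        print(f"decode (remaining tokens):     {end - first_token_at:.2f}s")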

drodgers•3mo ago
This is really cool!

Now I'm trying to stop myself from finding an excuse to spend upwards of $30k on compute hardware...

tuananh•3mo ago
if you have $30k to spare, I'm sure there are better options
_ea1k•3mo ago
Yeah, a couple of RTX Pro 6000 cards would blow this away and still leave him with money to spare.
solarkraft•3mo ago
This is a wonderful explanation of the two phases! I appreciate the hardware concerns for both now.

Reading the article, I wished for a device that just does both things well. On that topic, it might be noteworthy that Apple's just-released M5 has approximately 3.5x-ed TTFT performance compared to the M4, according to their claims!

daft_pink•3mo ago
It’s really sad that exo went private.
ethanpil•3mo ago
How do you know this happened? I thought it was an abandoned project until I saw this post. I've been diligently checking weekly for new releases but nothing for almost a year...
alexandercheema•3mo ago
Appreciate you checking back so often. We have some exciting plans. Keep checking and it won't be long before something pops up :)
storus•3mo ago
Wouldn't this restrict memory to 128GB, wasting M3 Ultra potential?
alexandercheema•3mo ago
Blog author here. Actually, no: the model can be streamed into the DGX Spark, so we can run prefill for models much larger than 128GB (e.g. DeepSeek R1) on the DGX Spark. This feature is coming to EXO 1.0, which will be open-sourced soon™.
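
To illustrate the general idea (this is not EXO's code, just a toy sketch with made-up shapes and paths): prefill can process a model larger than device memory because only one layer's weights need to be resident at a time, while the prompt's activations stay in memory and each layer is streamed in, applied, and evicted.

    # Toy illustration of layer-streaming prefill with numpy; a real system would
    # stream transformer blocks from SSD or a peer device, not random matrices.
    import numpy as np

    N_LAYERS, D_MODEL, SEQ_LEN = 4, 1024, 2048

    # Pretend each layer's weights live outside device memory.
    for i in range(N_LAYERS):
        np.save(f"/tmp/layer_{i}.npy",
                np.random.randn(D_MODEL, D_MODEL).astype(np.float32))

    # Activations for the whole prompt stay resident throughout prefill.
    x = np.random.randn(SEQ_LEN, D_MODEL).astype(np.float32)

    for i in range(N_LAYERS):
        w = np.load(f"/tmp/layer_{i}.npy", mmap_mode="r")  # stream this layer in
        x = np.tanh(x @ w)                                 # stand-in for a transformer block
        del w                                              # layer weights can be evicted immediately

    print("prefill done; resident activations:", x.shape)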
storus•3mo ago
Excellent! Good luck!
musicale•3mo ago
But you could also just get two DGX Sparks and get 2 * 1.9x = 3.8x total throughput for two query streams.
rcarmo•3mo ago
This is very nicely done. I wonder what the values will look like a year from now with M5 Macs, though.