frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Ask HN: Do we need "metadata in source code" syntax that LLMs will never delete?

1•andrewstuart•2m ago•1 comments

Pentagon cutting ties w/ "woke" Harvard, ending military training & fellowships

https://www.cbsnews.com/news/pentagon-says-its-cutting-ties-with-woke-harvard-discontinuing-milit...
2•alephnerd•5m ago•1 comments

Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? [pdf]

https://cds.cern.ch/record/405662/files/PhysRev.47.777.pdf
1•northlondoner•5m ago•1 comments

Kessler Syndrome Has Started [video]

https://www.tiktok.com/@cjtrowbridge/video/7602634355160206623
1•pbradv•8m ago•0 comments

Complex Heterodynes Explained

https://tomverbeure.github.io/2026/02/07/Complex-Heterodyne.html
2•hasheddan•9m ago•0 comments

EVs Are a Failed Experiment

https://spectator.org/evs-are-a-failed-experiment/
2•ArtemZ•20m ago•3 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•21m ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
2•LiamPowell•23m ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
3•duxup•25m ago•0 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•27m ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•39m ago•1 comments

Deeper into the shareing of one air conditioner for 2 rooms

1•ozzysnaps•41m ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
3•savrajsingh•42m ago•0 comments

Why Embedded Models Must Hallucinate: A Boundary Theory (RCC)

http://www.effacermonexistence.com/rcc-hn-1-1
1•formerOpenAI•43m ago•2 comments

A Curated List of ML System Design Case Studies

https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
3•tejonutella•47m ago•0 comments

Pony Alpha: New free 200K context model for coding, reasoning and roleplay

https://ponyalpha.pro
1•qzcanoe•52m ago•1 comments

Show HN: Tunbot – Discord bot for temporary Cloudflare tunnels behind CGNAT

https://github.com/Goofygiraffe06/tunbot
2•g1raffe•54m ago•0 comments

Open Problems in Mechanistic Interpretability

https://arxiv.org/abs/2501.16496
2•vinhnx•1h ago•0 comments

Bye Bye Humanity: The Potential AMOC Collapse

https://thatjoescott.com/2026/02/03/bye-bye-humanity-the-potential-amoc-collapse/
3•rolph•1h ago•0 comments

Dexter: Claude-Code-Style Agent for Financial Statements and Valuation

https://github.com/virattt/dexter
1•Lwrless•1h ago•0 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
1•vermilingua•1h ago•0 comments

Essential CDN: The CDN that lets you do more than JavaScript

https://essentialcdn.fluidity.workers.dev/
1•telui•1h ago•1 comments

They Hijacked Our Tech [video]

https://www.youtube.com/watch?v=-nJM5HvnT5k
2•cedel2k1•1h ago•0 comments

Vouch

https://twitter.com/mitchellh/status/2020252149117313349
37•chwtutha•1h ago•6 comments

HRL Labs in Malibu laying off 1/3 of their workforce

https://www.dailynews.com/2026/02/06/hrl-labs-cuts-376-jobs-in-malibu-after-losing-government-work/
4•osnium123•1h ago•1 comments

Show HN: High-performance bidirectional list for React, React Native, and Vue

https://suhaotian.github.io/broad-infinite-list/
2•jeremy_su•1h ago•0 comments

Show HN: I built a Mac screen recorder Recap.Studio

https://recap.studio/
1•fx31xo•1h ago•1 comments

Ask HN: Codex 5.3 broke toolcalls? Opus 4.6 ignores instructions?

1•kachapopopow•1h ago•0 comments

Vectors and HNSW for Dummies

https://anvitra.ai/blog/vectors-and-hnsw/
1•melvinodsa•1h ago•0 comments

Sanskrit AI beats CleanRL SOTA by 125%

https://huggingface.co/ParamTatva/sanskrit-ppo-hopper-v5/blob/main/docs/blog.md
1•prabhatkr•1h ago•1 comments
Open in hackernews

Mixture-of-Experts explained with PyTorch implementation

https://medium.com/@lightcapai/scaling-transformers-with-mixture-of-experts-moe-1a361fee46bf
1•kinderpingui•2mo ago

Comments

kinderpingui•2mo ago
I wrote this to explain how MoE achieves 64x more parameters with only 2x compute cost. The key insight is sparse activation - each token only hits 2 experts out of 8, while all expert weights stay loaded.

Includes working PyTorch code for the routing mechanism and shows how experts naturally specialize (one handles punctuation, another numbers, etc) without explicit supervision.

Models like Mixtral-8x7B (45B params, runs like 14B) prove this works at scale. Happy to answer questions about the implementation.