10,924x: The Instability Bomb at 1.7B Scale

https://taylorkolasinski.com/notes/mhc-reproduction-part2/
3•taykolasinski•2h ago

Comments

taykolasinski•2h ago
OP here. This is Part 2 of my reproduction series. I scaled the experiment from 10M params (MacBook) to 1.7B params (8x H100s) to test DeepSeek's instability claims.

The paper reported 3,000x signal amplification. I found 10,924x.

The "Instability Bomb" findings:

- The Scaling Law: Amplification gets much worse with scale: 9x at 10M params, 10,924x at 1.7B params.

- The Culprit: It's Layer 0. The first mixing matrix eats raw embeddings without LayerNorm and immediately amplifies them.

- The Twist: Despite 10,000x amplification, the model didn't diverge. It kept learning, likely saved by gradient clipping.
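To illustrate why gradient clipping could plausibly keep training stable despite that amplification, here is a minimal sketch of clipping by global L2 norm. The function name `clip_by_global_norm`, the gradient values, and the `max_norm=1.0` threshold are all hypothetical and not taken from the post; the actual run may use a different threshold or clipping variant.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients so their combined L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total = math.sqrt(sum(g * g for vec in grads for g in vec))
    # Only shrink; gradients already under the threshold are left unchanged.
    scale = min(1.0, max_norm / total) if total > 0 else 1.0
    return [[g * scale for g in vec] for vec in grads], total

# Hypothetical gradients, inflated by the amplification factor quoted above.
grads = [[3.0 * 10924, 4.0 * 10924], [0.0, 0.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))

print(f"norm before clipping: {norm_before:.1f}")
print(f"norm after clipping:  {norm_after:.6f}")
```

Clipping preserves the gradient direction and only caps the step size, which is consistent with a model that keeps learning rather than diverging. It does not remove the amplification in the forward pass.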

I’ve posted the full logs and Amax graphs in the post. Happy to answer questions about the H100 cluster setup or the Sinkhorn projection math.
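For readers unfamiliar with the Sinkhorn projection mentioned above, the following is a rough sketch of the idea in plain Python. It assumes the projection alternately normalizes rows and columns of a positive mixing matrix until it is approximately doubly stochastic, and it assumes the gain of a stack of layers can be measured as the maximum absolute row sum of the composite matrix. The 4x4 matrix, the depth of 8, and all function names are hypothetical; the post's actual setup and metric may differ.

```python
def sinkhorn(matrix, iters=50):
    """Alternately normalize rows and columns of a positive square matrix."""
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(iters):
        for row in m:                      # make every row sum to 1
            s = sum(row)
            for j in range(n):
                row[j] /= s
        for j in range(n):                 # make every column sum to 1
            s = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= s
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_row_sum(m):
    return max(sum(abs(x) for x in row) for row in m)

def composite_gain(layer, depth):
    """Gain of `depth` identical mixing layers applied in sequence."""
    out = layer
    for _ in range(depth - 1):
        out = matmul(out, layer)
    return max_abs_row_sum(out)

# Hypothetical unconstrained mixing matrix (every row sums to more than 1).
raw = [[1.0, 0.6, 0.2, 0.1],
       [0.3, 1.0, 0.5, 0.2],
       [0.2, 0.4, 1.0, 0.6],
       [0.1, 0.3, 0.5, 1.0]]
projected = sinkhorn(raw)

raw_gain = composite_gain(raw, depth=8)
projected_gain = composite_gain(projected, depth=8)
print(f"unconstrained gain over 8 layers: {raw_gain:.1f}")
print(f"Sinkhorn-projected gain over 8 layers: {projected_gain:.6f}")
```

Products of doubly stochastic matrices are themselves doubly stochastic, so the projected stack keeps a gain of about 1 at any depth, while the unconstrained stack compounds its per-layer gain.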

Show HN: I scrapped my working AI agent pipeline and rebuilt it (postmortem)

https://xenendev.github.io/2025/12/15/agentic-vs-procedural/
1•xvpdev•49s ago•0 comments

OpenAI Has Some Catching Up to Do

https://every.to/chain-of-thought/openai-has-some-catching-up-to-do
1•dshipper•6m ago•0 comments

Hobby Horsing

https://en.wikipedia.org/wiki/Hobby_horsing
1•mooreds•9m ago•0 comments

Illuminating Data: From Medieval Scriptoria to the Cyber-Saint

https://substack.com/inbox/post/184644294
1•dmazin•10m ago•0 comments

1Password's Enterprise Identity Transformation

https://softwareanalyst.substack.com/p/inside-1passwords-enterprise-identity
1•mooreds•11m ago•0 comments

IO buffering: Why Ruby logs behave differently in containers

https://zhisme.com/articles/ruby-io-buffering/
1•zhisme•11m ago•0 comments

EA Shader To Human – HLSL/GLSL library for debugging shaders

https://github.com/electronicarts/ShaderToHuman
1•bauc•11m ago•0 comments

Savewalterwhite.com – is it a real website?

https://redas.dev/blog/save-walter-white/
1•holoflash•11m ago•0 comments

Visualizing the full technology stack of an LLM query [video]

https://www.youtube.com/watch?v=nmBqcRl2tmM
1•prajwal299•11m ago•1 comments

How Universities Are Shutting Out Disabled Students and Staff

https://thewalrus.ca/how-universities-are-shutting-out-disabled-students-and-staff/
2•speckx•12m ago•0 comments

Show HN: Fluent, a tiny lang for differentiable tensors and reactive programming

https://github.com/mlajtos/fluent
1•mlajtos•13m ago•0 comments

Why LLMs Are Not (Yet) the Silver Bullet for Unstructured Data Processing

https://unstract.com/blog/why-llms-struggle-with-unstructured-data/
1•naren87•14m ago•0 comments

News Corp is rolling out AI in its newsroom

https://www.siliconsnark.com/ai-just-moved-into-the-newsroom-is-this-the-end-of-journalism-or-its...
1•SaaSasaurus•14m ago•0 comments

Life Under a Clicktatorship

https://donmoynihan.substack.com/p/life-under-a-clicktatorship
1•mooreds•15m ago•0 comments

Tighter bounds in the prime number theorem

https://www.johndcook.com/blog/2026/01/16/prime-number-theorem-bounds/
1•7777777phil•15m ago•0 comments

The AI boom is heralding a new gold rush in the American west

https://www.theguardian.com/technology/2025/dec/04/nevada-ai-data-centers
1•vednig•16m ago•0 comments

Innova-2 Flex XCKU15P Setup and Usage Notes

https://github.com/mwrnd/innova2_flex_xcku15p_notes
1•Fnoord•17m ago•0 comments

Partly AI-generated folk-pop hit barred from Sweden's official charts

https://www.theguardian.com/technology/2026/jan/16/partly-ai-generated-folk-pop-hit-barred-from-s...
2•leonidasv•17m ago•0 comments

Spotify increases its US subscription prices for the third time in 3 years

https://sherwood.news/markets/spotify-increases-its-us-subscription-prices-for-the-third-time-in-...
3•avonmach•17m ago•1 comments

All agents will become coding agents

https://davistreybig.substack.com/p/all-agents-will-become-coding-agents
1•davistreybig•17m ago•0 comments

YouTube is adding over 100 classic Sesame Street episodes

https://9to5google.com/2026/01/15/youtube-adds-over-100-sesame-street-episodes/
1•thunderbong•18m ago•0 comments

STFU

https://github.com/Pankajtanwarbanna/stfu
2•tanelpoder•19m ago•0 comments

SelfCI – a minimalistic local-first Unix-philosophy-abiding CI

https://app.radicle.xyz/nodes/radicle.dpc.pw/rad%3Az2tDzYbAXxTQEKTGFVwiJPajkbeDU
1•birdculture•21m ago•1 comments

Show HN: Map of California SNO-Parks with current snow depth data

https://cloudkj.github.io/snowpack/examples/ca_sno_parks/
1•cloudkj•21m ago•0 comments

Semantic highlight model to cut token cost for RAG

https://huggingface.co/blog/zilliz/zilliz-semantic-highlight-model
2•codingjaguar•24m ago•0 comments

Claude Code for Product Managers

https://ccforpms.com/
1•riknos314•24m ago•0 comments

How WhatsApp Took over the Global Conversation

https://www.newyorker.com/magazine/2026/01/19/how-whatsapp-took-over-the-global-conversation
1•brightbeige•25m ago•0 comments

Painted Halafian Pottery of Mesopotamia and Prehistoric Mathematical Thinking

https://link.springer.com/article/10.1007/s10963-025-09200-9
1•Schiphol•25m ago•0 comments

Disproof of Large Language Model Consciousness

https://web3.arxiv.org/pdf/2512.12802
3•jbotz•26m ago•0 comments

American Invasion of Greenland (2029)

https://falloutfanfic.fandom.com/wiki/American_Invasion_of_Greenland
1•thastings•28m ago•1 comments