frontpage.

The pragmatic tradeoff of tied embeddings

https://blog.silennai.com/tied-embeddings
1•SilenN•1h ago

Comments

SilenN•1h ago
Simply put, tied embeddings are when your output (unembedding) matrix is the same matrix as your input embedding.

You save vocab_size * model_dim params (e.g. ~617M for GPT-3).
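A quick sanity check of that figure, as a sketch. The GPT-3 sizes used here (vocabulary of 50257 tokens, model dimension of 12288) are assumptions taken from the published GPT-3 175B configuration rather than from this comment, and `tied_embedding_savings` is a hypothetical helper name.

```python
def tied_embedding_savings(vocab_size: int, model_dim: int) -> int:
    """Parameters saved by reusing the input embedding as the output matrix.

    An untied model stores two (vocab_size x model_dim) matrices;
    a tied model stores one, so the saving is exactly one matrix.
    """
    return vocab_size * model_dim


# Assumed GPT-3 175B sizes (not stated in the comment above).
GPT3_VOCAB_SIZE = 50257
GPT3_MODEL_DIM = 12288

saved = tied_embedding_savings(GPT3_VOCAB_SIZE, GPT3_MODEL_DIM)
print(f"{saved:,} parameters saved")  # 617,558,016 parameters saved
```

Against roughly 175B total parameters that is well under 1%, which fits the point that tying matters more for small models, where the embedding matrix is a larger share of the total.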

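But the residual stream means the embedding and unembedding matrices are roughly connected by a direct matmul. With tied weights, the direct-path score for "token a followed by token b" is the dot product of their two embeddings, and since the dot product is commutative, that score is identical to the one for "b followed by a". So the model struggles to encode bigrams, which are usually asymmetric.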
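The symmetry argument can be checked on a toy example. This is a minimal sketch that only models the direct path (embedding straight to unembedding, with no attention or MLP in between); the sizes, the random weights, and the helper names `direct_path_logits` and `is_symmetric` are all made up for illustration.

```python
import random


def direct_path_logits(embed, unembed):
    """logits[a][b] = embed[a] . unembed[b], i.e. embed @ unembed^T.

    This is the score for "token a is followed by token b" when the
    residual stream carries the input embedding straight to the output.
    """
    return [
        [sum(x * y for x, y in zip(row_a, row_b)) for row_b in unembed]
        for row_a in embed
    ]


def is_symmetric(matrix, tol=1e-12):
    """True if matrix[a][b] == matrix[b][a] for every pair (within tol)."""
    n = len(matrix)
    return all(
        abs(matrix[a][b] - matrix[b][a]) <= tol
        for a in range(n)
        for b in range(n)
    )


random.seed(0)
vocab_size, model_dim = 5, 3  # toy sizes, chosen arbitrarily


def random_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


embed = random_matrix(vocab_size, model_dim)
unembed = random_matrix(vocab_size, model_dim)  # separate output matrix

tied = direct_path_logits(embed, embed)      # output matrix = input matrix
untied = direct_path_logits(embed, unembed)  # independent output matrix

print(is_symmetric(tied))    # True: score(a -> b) == score(b -> a)
print(is_symmetric(untied))  # False: the two directions can differ
```

In the tied case the direct path cannot prefer "a then b" over "b then a"; any such preference has to come from the attention and MLP layers.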

Attention + MLP add nonlinearity, which helps, but tying still means less expressivity.

That's why tied embeddings aren't used in SOTA models, but are useful in smaller ones.

Vimeo starts layoffs after acquisition by Bending Spoons

https://techcrunch.com/2026/01/22/vimeo-starts-layoffs-after-acquisition-by-bending-spoons/
1•absqueued•2m ago•1 comments

The Tighter Weave: On Editing and Not Editing

https://hedgehogreview.com/issues/place-and-revolution/articles/the-tighter-weave
1•herbertl•3m ago•0 comments

The Work Behind the Writing: On Writers and Their Day Jobs

https://lithub.com/the-work-behind-the-writing-on-writers-and-their-day-jobs/
1•herbertl•4m ago•0 comments

AWS in 2026: The Year of Proving They Still Know How to Operate

https://www.lastweekinaws.com/blog/aws-in-2026-the-year-of-proving-they-still-know-how-to-operate/
1•mooreds•6m ago•0 comments

A Controversial Video-Game Designer Returns with a New Game and More Controversy

https://www.bloomberg.com/news/newsletters/2026-01-22/jonathan-blow-a-controversial-video-game-de...
2•DustinEchoes•8m ago•0 comments

Cost-Effectiveness of Language Models for Time Series Anomaly Detection

https://www.mdpi.com/2078-2489/17/1/72
2•PaulHoule•9m ago•1 comments

Why leaders often disappoint us

https://ariadne.space/2026/01/22/why-leaders-often-disappoint-us.html
2•milkglass•11m ago•0 comments

Computer vision papers reimplemented with minimal PyTorch code

https://github.com/MaximeVandegar/Papers-in-100-Lines-of-Code
1•maxvdg•13m ago•1 comments

Stochastic Mirrors rather than Stochastic Parrots?

1•EdNutting•14m ago•0 comments

I was detained at Davos for my hardware prototype; Swiss police audited the Rust

https://www.semafor.com/article/01/22/2026/an-entrepreneurs-13-hours-in-davos-jail-the-food-was-p...
3•reutinger•16m ago•1 comments

ICE detains five-year-old Minnesota boy arriving home, say school officials

https://www.theguardian.com/us-news/2026/jan/21/ice-arrests-five-year-old-boy-minnesota
15•0x54MUR41•18m ago•0 comments

Seeking Co-Founder for Declarative Application Editor

1•mwhite•23m ago•1 comments

Authorization Before Retrieval: Making RAG Safe by Construction

https://www.windley.com/archives/2026/01/authorization_before_retrieval_making_rag_safe_by_constr...
1•mooreds•31m ago•0 comments

Week 1: EE 292P Atoms, Bits, and the National Interest

https://hnvr.medium.com/week-1-ee-292p-atoms-bits-and-the-national-interest-the-technology-enviro...
1•malchow•32m ago•0 comments

Why doing a mix of exercise could be the key to longer life

https://www.bbc.com/news/articles/cn0y9pqe2zro
3•kareemm•33m ago•0 comments

Monitor Cron Jobs Without Migration – DeadManPing

https://www.deadmanping.com/blog/monitor-cron-jobs
1•BlackPearl02•35m ago•0 comments

Starting a Startup at 25, 35, or 45 Is Not the Same Decision

4•alx_sukhanov•45m ago•1 comments

We spent 5 YEARS building New York City in Minecraft [video]

https://www.youtube.com/watch?v=ZouSJWXFBPk
1•KolmogorovComp•45m ago•0 comments

Rent-Only Copyright Culture Makes Us All Worse Off

https://www.eff.org/deeplinks/2026/01/rent-only-copyright-culture-makes-us-all-worse
6•hn_acker•45m ago•0 comments

Show HN: Memcachex, a high-performance Memcached client for Go

https://github.com/atsegelnyk/memcachex
1•atsegelnyk•46m ago•1 comments

Utah Continues to Ban More Books, Even as It Racks Up More Lawsuits

https://www.techdirt.com/2026/01/22/utah-continues-to-ban-more-books-even-as-it-racks-up-more-law...
5•hn_acker•46m ago•0 comments

Kona: Energy-Based Models (EBMs) for AI Reasoning

https://logicalintelligence.com/kona-ebms-energy-based-models
2•gfortaine•48m ago•0 comments

Revealjs-skill: a better way for Claude to make presentations

https://github.com/ryanbbrown/revealjs-skill
1•ryanbbrown•50m ago•0 comments

Stunnel

https://www.stunnel.org/
3•firesteelrain•51m ago•0 comments

Vibe a Guitar Pedal

https://polyend.com/endless/
7•mulhoon•52m ago•7 comments

Four Ingredients for Successful Retrofitting

https://bmin.ai/retrofitting/
1•nl•53m ago•0 comments

TikTok Strikes Deal for New U.S. Entity, Ending Long Legal Saga

https://www.nytimes.com/2026/01/22/technology/tiktok-deal-oracle-bytedance-china-us.html
6•jbegley•58m ago•0 comments

Why medieval city-builder video games are historically inaccurate (2020)

https://www.leidenmedievalistsblog.nl/articles/why-medieval-city-builder-video-games-are-historic...
31•benbreen•58m ago•7 comments

WAForth: Forth Interpreter+Compiler for WebAssembly

https://github.com/remko/waforth
1•publicdebates•1h ago•0 comments

Clean Web UI for Steve Yegge's Beads

https://github.com/nmelo/bdui
1•nmelo•1h ago•0 comments