Show HN: The Hessian of tall-skinny networks is easy to invert

https://github.com/a-rahimi/hessian
11•rahimiali•1h ago
It turns out the inverse of the Hessian of a deep net is easy to apply to a vector. Doing this naively takes cubically many operations in the number of layers (so impractical), but it's possible to do this in time linear in the number of layers (so very practical)!

This is possible because the Hessian of a deep net has a matrix polynomial structure that factorizes nicely. The Hessian-inverse-product algorithm that takes advantage of this is similar to running backprop on a dual version of the deep net. It echoes an old idea of Pearlmutter's for computing Hessian-vector products.

Maybe this idea is useful as a preconditioner for stochastic gradient descent?
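
For readers who haven't seen the Pearlmutter idea mentioned above, here is a minimal sketch of a Hessian-vector product that never forms the Hessian. It is not code from the linked repository: the toy loss, the `Dual` class, and all function names are assumptions made for illustration. The trick is to evaluate the gradient at x + eps*v with eps**2 = 0; the eps-coefficient of the result is exactly Hv.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Number re + du*eps with eps**2 == 0; du carries a directional derivative."""
    re: float
    du: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.re + other.re, self.du + other.du)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps
        return Dual(self.re * other.re, self.re * other.du + self.du * other.re)

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero derivative part."""
    return x if isinstance(x, Dual) else Dual(float(x))


def grad_f(x):
    """Hand-written gradient of the hypothetical toy loss f = x0**2 * x1 + x1**3."""
    x0, x1 = x
    return [2 * x0 * x1, x0 * x0 + 3 * x1 * x1]


def hessian_vector_product(grad, x, v):
    """Evaluate grad at x + eps*v; the eps-coefficients are the entries of H v."""
    shifted = [Dual(xi, vi) for xi, vi in zip(x, v)]
    return [g.du for g in grad(shifted)]


# At x = (1, 2) the Hessian is [[4, 2], [2, 12]], so H @ (1, 0) is its first column.
hv = hessian_vector_product(grad_f, [1.0, 2.0], [1.0, 0.0])
print(hv)
```

The cost is one extra pass through the gradient computation, regardless of how large the Hessian would be if written out.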

Comments

MontyCarloHall•23m ago
>If the Hessian-vector product is Hv for some fixed vector v, we're interested in solving Hx=v for x. The hope is to soon use this as a preconditioner to speed up stochastic gradient descent.

Silly question, but if you have some clever way to compute the inverse Hessian, why not go all the way and use it for Newton's method, rather than as a preconditioner for SGD?

rahimiali•9m ago
Good q. The method computes Hessian-inverse on a batch. When people say "Newton's method" they're often thinking H^{-1} g, where both the Hessian and the gradient g are on the full dataset. I thought saying "preconditioner" instead of "Newton's method" would make it clear this is solving H^{-1} g on a batch, not on the full dataset.
MontyCarloHall•8m ago
I'd call it "Stochastic Newton's Method" then. :-)
rahimiali•3m ago
fair. thanks. i'll sleep on it and update the paper if it still sounds right tomorrow.

probably my nomenclature bias is that i started this project as a way to find new preconditioners on deep nets.
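
To make the batch-versus-full-dataset distinction in this exchange concrete, here is a rough sketch of a single H^{-1} g step where both H and g come from one mini-batch. Everything in it is an assumption for illustration: a two-parameter least-squares loss, a made-up batch, and a closed-form 2x2 solve standing in for the paper's Hessian-inverse-product algorithm.

```python
def batch_grad_hess(w, batch):
    """Gradient and Hessian of the mean of 0.5 * (a . w - y)**2 over one batch."""
    n = len(batch)
    g = [0.0, 0.0]
    h = [[0.0, 0.0], [0.0, 0.0]]
    for (a0, a1), y in batch:
        r = a0 * w[0] + a1 * w[1] - y  # residual of this sample
        g[0] += a0 * r / n
        g[1] += a1 * r / n
        h[0][0] += a0 * a0 / n
        h[0][1] += a0 * a1 / n
        h[1][0] += a1 * a0 / n
        h[1][1] += a1 * a1 / n
    return g, h


def solve_2x2(h, g):
    """Solve h @ s = g by Cramer's rule (assumes h is invertible)."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [
        (h[1][1] * g[0] - h[0][1] * g[1]) / det,
        (h[0][0] * g[1] - h[1][0] * g[0]) / det,
    ]


def newton_step_on_batch(w, batch):
    """One step w <- w - H_batch^{-1} g_batch using only this batch's data."""
    g, h = batch_grad_hess(w, batch)
    s = solve_2x2(h, g)
    return [w[0] - s[0], w[1] - s[1]]


# Hypothetical batch whose targets are generated by w = (2, -1).
batch = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
w = newton_step_on_batch([0.0, 0.0], batch)
g_after, _ = batch_grad_hess(w, batch)
print(w, g_after)
```

Because the toy loss is quadratic, one step drives the batch gradient to zero. On a different batch the step would land somewhere else, which is why the result reads more like a preconditioned stochastic step than classical Newton on the full dataset.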

jeffjeffbear•12m ago
I haven't looked into it in years, but would the inverse of a block bi-diagonal matrix have some semiseparable structure? Maybe that would be good to look into?
rahimiali•5m ago
just to be clear, semiseparable in this context means H = D + CC', where D is block diagonal and C is tall & skinny?

If so, it would be nice if this were the case, because you could then just use the Woodbury formula to invert H. But I don't think such a decomposition exists. I tried to exhaustively search through all the decompositions of H that involved one dummy variable (of which the above is a special case) and I couldn't find one. I ended up having to introduce two dummy variables instead.
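
As a sketch of why a D + CC' decomposition would have been convenient, here is the Woodbury formula in its simplest form. The assumptions are strong and purely illustrative: D is diagonal rather than block diagonal, and C is a single column c, which reduces Woodbury to the Sherman-Morrison identity. The comment above says the deep-net Hessian does not appear to have this structure, so this shows the shortcut that was hoped for, not the method in the paper.

```python
def woodbury_rank1_solve(d, c, v):
    """Solve (diag(d) + c c^T) x = v in O(n) time.

    Uses (D + c c^T)^{-1} v = D^{-1} v - D^{-1} c (c^T D^{-1} v) / (1 + c^T D^{-1} c).
    """
    dinv_v = [vi / di for vi, di in zip(v, d)]
    dinv_c = [ci / di for ci, di in zip(c, d)]
    numer = sum(ci * yi for ci, yi in zip(c, dinv_v))
    denom = 1.0 + sum(ci * zi for ci, zi in zip(c, dinv_c))
    scale = numer / denom
    return [yi - scale * zi for yi, zi in zip(dinv_v, dinv_c)]


def apply_h(d, c, x):
    """Multiply (diag(d) + c c^T) by x without forming the matrix."""
    ctx = sum(ci * xi for ci, xi in zip(c, x))
    return [di * xi + ci * ctx for di, ci, xi in zip(d, c, x)]


# Hypothetical data: a 3x3 system with a rank-one correction.
d = [2.0, 4.0, 5.0]
c = [1.0, 2.0, 0.0]
v = [3.0, -1.0, 10.0]

x = woodbury_rank1_solve(d, c, v)
residual = max(abs(hi - vi) for hi, vi in zip(apply_h(d, c, x), v))
print(x, residual)
```

With a tall and skinny C of k columns, the scalar division becomes a k-by-k solve, so the cost stays dominated by the cheap D^{-1} products.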

Remails: A European Mail Transfer Agent

https://tweedegolf.nl/en/blog/197/remails
1•Flundstrom2•1m ago•0 comments

Neutral-atom arrays, a rapidly emerging quantum computing platform, gets a boost

https://phys.org/news/2026-01-neutral-atom-arrays-rapidly-emerging.html
1•rbanffy•1m ago•0 comments

Surprise Finding: Immune System May Keep Us from Burning Fat

https://www.medscape.com/viewarticle/surprise-finding-immune-system-may-keep-us-burning-fat-2026a...
1•wjb3•4m ago•1 comments

Will Your AI Teammate Bring Bagels to Standup?

https://redmonk.com/kholterhoff/2026/01/16/will-your-ai-teammate-bring-bagels-to-standup/
1•mooreds•4m ago•0 comments

ADBC: An Intro to NextGen Database Connections

https://thefulldatastack.substack.com/p/adbc-an-intro-to-nextgen-database
1•nhemerson•4m ago•0 comments

First 'dark factory' where robots build the car tipped to open by 2030

https://www.autonews.com/technology/ane-fully-automated-car-plant-china-us-0115/
1•rmason•4m ago•0 comments

Show HN: Visualize Python binary dependencies and subprocess calls in a browser

https://surfactant.readthedocs.io/en/latest/pypi_dependency_analyzer.html
1•rmast•4m ago•0 comments

Innovations in energy and finance are further inflating the AI bubble

https://www.economist.com/business/2026/01/15/innovations-in-energy-and-finance-are-further-infla...
1•1vuio0pswjnm7•5m ago•0 comments

Social Web Working Group Charter

https://www.w3.org/2026/01/social-web-wg-charter.html
1•bovermyer•6m ago•0 comments

HackerTrain/groups: Issue-only repository/pseudo-forum for organising groups

https://codeberg.org/HackerTrain/groups
1•edward•7m ago•0 comments

Tab, Tab, Dead

https://ampcode.com/news/tab-tab-dead
1•herczegzsolt•8m ago•1 comments

The Space and Motion of Communicating Agents (2008) [pdf]

https://www.cl.cam.ac.uk/archive/rm135/Bigraphs-draft.pdf
1•dhorthy•10m ago•0 comments

Don't fall into the anti-AI hype – <antirez>

https://davidcel.is/links/2011919384662230086-dont-fall-into-the-anti-ai-hype-antirez
1•frizlab•12m ago•3 comments

How we built CoPE

https://blog.zentropi.ai/how-we-built-cope/
2•erlend_sh•12m ago•0 comments

Two Thinking Machines Lab Cofounders Are Leaving to Rejoin OpenAI

https://www.wired.com/story/thinking-machines-lab-cofounders-leave-for-openai/
2•monkeydust•14m ago•1 comments

Nuudel: Non-Tracking Appointment Tool

https://nuudel.digitalcourage.de/
1•doener•14m ago•0 comments

There Is No Green Transition, and This Book Explains Why

https://www.highspeed.blog/too-much-more/
2•doener•17m ago•0 comments

Iran's internet shutdown is now one of its longest ever, as protests continue

https://techcrunch.com/2026/01/15/irans-internet-shutdown-is-now-one-of-its-longest-ever-as-prote...
13•ukblewis•18m ago•2 comments

María Corina Machado says she presented Trump with her Nobel peace prize medal

https://www.theguardian.com/world/2026/jan/15/maria-corina-machado-says-she-presented-trump-with-...
2•vinni2•18m ago•2 comments

DHS used neo-nazi anthem for recruitment after fatal Minneapolis shooting

https://theintercept.com/2026/01/13/dhs-ice-white-nationalist-neo-nazi/
3•anigbrowl•18m ago•1 comments

Yacv (Yet Another Compiler Visualizer): LL and LR Parser Animations

https://github.com/ashutoshbsathe/yacv
1•fanf2•23m ago•0 comments

Releasing Rainbow Tables to Accelerate Net-NTLMv1 Protocol Deprecation

https://cloud.google.com/blog/topics/threat-intelligence/net-ntlmv1-deprecation-rainbow-tables
2•notmine1337•23m ago•0 comments

A Powerful New Stealth Model from a Top OSS Lab

https://blog.kilo.ai/p/announcing-a-powerful-new-stealth
1•emschwartz•24m ago•0 comments

Show HN: React hook for real-time voice with Gemini Live API

https://github.com/deflectionrate/gemini-live-react
1•loffloff•26m ago•0 comments

Favorite Rust Crates of 2025

https://docs.freestyle.sh/blog/rust-crates-2025
3•benswerd•27m ago•0 comments

Porsche Restored This 20-Year-Old Carrera GT to 'Zero-Kilometer Condition'

https://www.thedrive.com/news/porsche-restored-this-20-year-old-carrera-gt-to-zero-kilometer-cond...
1•PaulHoule•31m ago•1 comments

We built a free cross-app AI assistant inspired by Apple Intelligence

https://www.gethelios.xyz/
1•rogermas•32m ago•1 comments

Show HN: A WebGPU-based browser engine with "Blam "-style physics

3•goovbot•32m ago•0 comments

WP-Bench: A WordPress AI Benchmark

https://make.wordpress.org/ai/2026/01/14/introducing-wp-bench-a-wordpress-ai-benchmark/
2•chilipepperhott•37m ago•0 comments

The Cost of PostgreSQL Arrays

https://boringsql.com/posts/good-bad-arrays/
3•birdculture•39m ago•0 comments