Summation-based aggregation: a simpler alternative to self-attention

https://www.techrxiv.org/users/954609/articles/1334505-summation-based-transformers-a-path-toward-linear-complexity-sequence-modeling?commit=b942083b403d62e7d56e45e5012cd375f1e12ddf
2•pfekin_2nd•2h ago

Comments

pfekin_2nd•2h ago
Summation-based aggregation replaces pairwise similarity with position-modulated projections and direct summation, reducing per-layer cost from quadratic to near-linear in sequence length.

On its own, summation is competitive for classification and multimodal tasks. In language modeling, a hybrid design — summation in most layers with a single final attention layer — matches or slightly outperforms full attention while staying nearly linear in cost.

GitHub: https://github.com/pfekin/summation-based-transformers

pfekin_2nd•2h ago
Author here — a few clarifications up front:

How is this different from Performer / linear attention? Performer and related methods approximate the softmax kernel with random features or low-rank projections. Summation is not an approximation — it eliminates similarity altogether. Tokens are modulated by positional encodings, projected with nonlinearities, and aggregated by direct addition.
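To make the three steps above concrete (modulate, project, add), here is a minimal pure-Python sketch. It is not the code from the linked repository. The elementwise-multiply modulation, the ReLU nonlinearity, the random stand-ins for learned weights, and every function name are assumptions chosen for illustration.

```python
import math
import random

def make_matrix(rows, cols, rng):
    # Random stand-in for learned parameters (assumption: real models learn these).
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def project_tokens(tokens, pos_table, weights):
    # Step 1: modulate each token by its positional vector (assumed elementwise multiply).
    # Step 2: apply a linear projection followed by ReLU (assumed nonlinearity).
    projected = []
    for tok, pos in zip(tokens, pos_table):
        modulated = [t * p for t, p in zip(tok, pos)]
        projected.append(
            [max(0.0, sum(w * m for w, m in zip(row, modulated))) for row in weights]
        )
    return projected

def pooled_sum(projected):
    # Step 3 (non-causal): add all token vectors into one sequence-level vector.
    return [sum(col) for col in zip(*projected)]

def causal_prefix_sum(projected):
    # Step 3 (causal): position i only sees the running sum of positions 0..i.
    running = [0.0] * len(projected[0])
    out = []
    for vec in projected:
        running = [r + v for r, v in zip(running, vec)]
        out.append(running)
    return out

rng = random.Random(0)
n, d = 5, 4  # sequence length and feature width, both hypothetical
tokens = make_matrix(n, d, rng)
pos_table = make_matrix(n, d, rng)
weights = make_matrix(d, d, rng)

projected = project_tokens(tokens, pos_table, weights)
pooled = pooled_sum(projected)
prefix = causal_prefix_sum(projected)

# The last causal prefix sum should equal the pooled sum over the whole sequence.
print(all(math.isclose(a, b) for a, b in zip(prefix[-1], pooled)))
```

No token is ever compared with another token: each one is transformed on its own and then added, so the aggregation loop does work proportional to the sequence length rather than to the number of token pairs.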

Does pure summation replace attention? In classification and multimodal regression, yes — summation alone is competitive and often faster. In autoregressive language modeling, pure summation underperforms. But a hybrid design (summation in most layers + a single final attention layer) matches or slightly beats full attention while keeping most of the network near-linear.
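The hybrid layout described above can be sketched as a layer plan with a back-of-envelope cost model. The operation counts are simplifying assumptions (n*n*d for pairwise similarity, 2*n*d for modulate-and-add, with projection costs ignored), not measurements from the paper, and the function names are hypothetical.

```python
def hybrid_layer_plan(num_layers: int) -> list[str]:
    # Summation in every layer except the last, which uses attention.
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    return ["summation"] * (num_layers - 1) + ["attention"]

def aggregation_ops(kind: str, n: int, d: int) -> int:
    # Rough multiply-add counts for the aggregation step only (assumed cost model).
    if kind == "attention":
        return n * n * d   # one length-d dot product per (query, key) pair
    if kind == "summation":
        return 2 * n * d   # one modulation and one addition per token feature
    raise ValueError(f"unknown layer kind: {kind}")

def stack_ops(plan: list[str], n: int, d: int) -> int:
    return sum(aggregation_ops(kind, n, d) for kind in plan)

n, d, depth = 2048, 64, 6  # hypothetical sequence length, width, and depth
hybrid = hybrid_layer_plan(depth)
full_attention = ["attention"] * depth

ratio = stack_ops(hybrid, n, d) / stack_ops(full_attention, n, d)
print(hybrid)
print(f"hybrid / full-attention aggregation cost: {ratio:.3f}")
```

Under these assumptions the single final attention layer dominates the hybrid's aggregation cost, so the saving over a full-attention stack grows with depth while the final layer itself stays quadratic.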

What scale are the experiments? Small-to-moderate scale (document classification, WikiText-2, AG News, etc.). Scaling laws remain an open question — collaboration on larger-scale validation is very welcome.

Why might this work? Summation acts as a bottleneck: only task-relevant features survive aggregation, which seems to restructure embeddings before the final attention layer stabilizes them. PCA and dimensionality analyses show distinctive representation dynamics compared to attention.
