Lost in the Middle at Birth: An Exact Theory of Transformer Context Bias

https://arxiv.org/abs/2603.10123
1•borundev•2h ago

Comments

borundev•2h ago
While the "Lost in the Middle" (LitM) phenomenon is well-documented empirically, it is usually attributed to training data distribution or the lack of long-range dependencies in common datasets.

In this paper, I show that LitM is actually present at initialization. By deriving an exact theory using the Jacobian Norm, I demonstrate that the characteristic U-shaped attention curve is a structural property of the Transformer architecture itself.

Key findings:

    Architectural Determinism: Even with random weights, the model is "born" prioritizing the start and end of sequences.

    Jacobian Norm Analysis: I use the Jacobian to measure how sensitive the output is to input tokens at different positions, showing a clear macroscopic bias.

    Pretraining vs. Initialization: I compare Qwen-2.5B at both stages to show that while training adds "content detectors" (local spikes), it does not remove the underlying global U-shape.

This suggests that "fixing" long-context retrieval might require rethinking the initialization or the softmax-attention geometry itself, rather than just scaling up training data.

I’m the author of the paper and would love to hear the community’s thoughts on whether this structural bias can ever truly be overcome within the standard Transformer paradigm.
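
As a rough illustration of the position-sensitivity idea described in this comment, here is a minimal toy sketch. It is not the paper's derivation or code. It assumes that attention at initialization is exactly uniform over the causal prefix (softmax of all-zero logits), that each layer is the linear map x ← x + Mx with a residual connection, and that there are no value/output projections, MLPs, or layer norms. Under those assumptions, the sensitivity of the final position's output to input position j is entry (L-1, j) of (I + M)^H. The function names are hypothetical.

```python
def causal_uniform_attention(L):
    """Row i averages uniformly over positions 0..i.

    Assumption: softmax over all-zero logits under a causal mask,
    used here as a stand-in for attention at initialization.
    """
    return [[1.0 / (i + 1) if j <= i else 0.0 for j in range(L)]
            for i in range(L)]


def last_token_sensitivity(L, H):
    """Return the last row of (I + M)^H as a list of length L.

    In this linear toy model, entry j is the derivative of the final
    position's output with respect to the input at position j after
    H residual attention layers.
    """
    M = causal_uniform_attention(L)
    # One-hot row vector selecting the final position.
    row = [0.0] * L
    row[L - 1] = 1.0
    for _ in range(H):
        # row <- row @ (I + M): identity term plus attention mixing term.
        row = [row[j] + sum(row[k] * M[k][j] for k in range(L))
               for j in range(L)]
    return row


if __name__ == "__main__":
    L, H = 32, 4
    profile = last_token_sensitivity(L, H)
    for j in (0, L // 2, L - 1):
        print(f"position {j:2d}: sensitivity {profile[j]:.4f}")
```

In this toy, the profile decays from the first position across the interior (repeated causal averaging favors early tokens) and then spikes at the final position, where the residual path contributes (1 + 1/L)^H. That gives a lopsided U rather than a symmetric one, and it says nothing about whether the paper's full analysis or experiments hold up.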

yorwba•18m ago
I recommend asking a friend who's a better writer and mathematician than Claude Code to help you reorganize the paper so that there are no gaps in the argumentation and incorrect statements like "For a purely causal transformer without residuals, the gradient routed from the final token L to an earlier token j after H layers is given by the bottom row of the exponential Cesàro Matrix M^H" are replaced with mathematically correct descriptions.

Also have them check your experiments, because the description doesn't inspire confidence your (Claude's) implementation isn't flawed in ways that invalidate your results. In particular, "our experimental code utilizes a highly efficient one-pass scalar-probe surrogate" sounds fishy.

HydraDB raises $6.5M to kill vector DBs

https://twitter.com/contextkingceo/status/2032098309029220456
1•anshulbhide•1m ago•0 comments

Users protest as Google Antigravity price floats upward

https://www.theregister.com/2026/03/12/users_protest_as_google_antigravity/
1•speckx•2m ago•0 comments

A miniature magnet rivals behemoths in strength for the first time

https://www.newscientist.com/article/2518964-a-miniature-magnet-rivals-behemoths-in-strength-for-...
1•Brajeshwar•2m ago•0 comments

Cybersecurity AI: Hacking Consumer Robots in the AI Era (2026)

https://arxiv.org/abs/2603.08665
1•mdelmundo•3m ago•1 comments

APL Education, Innovation and Impact Grants Available

https://TheAPLTrust.us/wp-admin/install.php
1•sudleyplace•3m ago•1 comments

Search Engine for Blogs and Podcasts

https://feedle.world/
1•TigerUniversity•3m ago•0 comments

Nvidia AI-Q Reached #1 on DeepResearch Bench I and II

https://huggingface.co/blog/nvidia/how-nvidia-won-deepresearch-bench
1•ibobev•3m ago•0 comments

POC JIT with Go (Plugins)

https://xnacly.me/posts/2024/jit-with-go/
1•ibobev•4m ago•0 comments

I Let AI Redesign a Landing Page. It Beat Our Human-Designed Version

https://www.crazyegg.com/blog/ai-vs-human-landing-page/
1•mooreds•4m ago•0 comments

BasedAgents open-source identity and reputation registry for AI agents

https://basedagents.ai/agents
1•maxcr•4m ago•1 comments

From Monolith to Microservices: The Redistribution of Complexity

https://www.ddhigh.com/en/2026/03/12/complexity-redistribution-from-monolith-to-microservices/
1•ibobev•4m ago•0 comments

Suburban school district uses license plate readers to verify student residency

https://www.nbcchicago.com/consumer/suburban-school-district-uses-license-plate-readers-to-verify...
5•josephcsible•4m ago•1 comments

I Was a 1x Coder at Best. AI Made Me a 0x Coder

https://decoupledlogic.com/2026/03/12/i-was-a-1x-coder-at-best-ai-made-me-a-0x-coder/
1•squidhunter•7m ago•0 comments

A new fiber giant takes shape as GFiber and Astound combine

https://www.lightreading.com/broadband/a-new-fiber-giant-takes-shape-as-gfiber-and-astound-combine
1•smurda•7m ago•0 comments

Show HN: AgentFork – Any repo, instantly runnable by AI agents or contributors

https://www.agentfork.dev/
1•judekim•7m ago•0 comments

Low-Latency Inference with Speculative Decoding on D-Matrix Corsair and GPU

https://gimletlabs.ai/blog/low-latency-spec-decode-corsair
1•nserrino•8m ago•0 comments

'A sobering preview': Extreme heat now affects one in three people globally

https://www.theguardian.com/environment/2026/mar/10/extreme-heat-study-global-warming-physical-ac...
2•laurex•8m ago•0 comments

Show HN: Oat Glassed – 11KB, no dep, almost semantic UI library make worse

https://github.com/good-lly/oat-glassed
1•neon_me•10m ago•1 comments

Show HN: I manually organized 1000 profiles (am I a dinosaur?)

1•vadelfe•11m ago•1 comments

Show HN: War News Wire – Real-Time Aggregator for the Iran-Israel-US Conflict

https://warnewswire.com
1•DrFisch•11m ago•1 comments

See You in Court

https://thezvi.substack.com/p/ai-159-see-you-in-court
1•7777777phil•11m ago•0 comments

The Shape of the Thing: Where we are, and what likely happens next

https://www.oneusefulthing.org/p/the-shape-of-the-thing
1•swolpers•13m ago•0 comments

Russian propaganda game glorifying war crimes in Ukraine released on Steam

https://old.reddit.com/r/BuyFromEU/comments/1rr7e5d/russian_propaganda_game_glorifying_war_crimes...
1•doener•13m ago•0 comments

Claude Code for the Semi-Reluctant, Somewhat Curious Rails Developer

https://robbyonrails.com/claude-code-curious-rails-developers/
1•robbyrussell•13m ago•0 comments

Show HN: Desktop conversation practice tool for serious language learners

https://lingle.ai/
1•andrewfhou•13m ago•0 comments

Verification URL to India's Higher Secondary Exam Resolves to Rickroll

https://timesofindia.indiatimes.com/city/delhi/rickrolling-in-board-exam-cbse-class-12-maths-pape...
1•srean•15m ago•0 comments

They Came to Spy on America. They Stayed to Coach Little League

https://www.politico.com/news/magazine/2026/03/07/soviet-spy-america-cold-war-00755831
1•colinprince•15m ago•0 comments

Unexplained Moscow internet blackouts spark fears of web censorship plan

https://www.theguardian.com/world/2026/mar/12/russia-internet-blackouts-walkie-talkies-moscow
2•laurex•15m ago•0 comments

Ask HN: What's your experience working on software for science?

2•temporalparts•15m ago•2 comments

Show HN: Developer Experience Newsletter

https://danilostojanovic.stoicdev.tech/the-patch
1•danesto•16m ago•0 comments