Why stop at 1M tokens when you can have 10M?

2•Zen_Sherbert•4h ago
To start us off, I'm going to make a ridiculous claim.

On my 7800XT gaming GPU, using less than 3GB of VRAM for the buffer, I have built an architecture that can process a 10 million token context.

This is not a joke. You can run it in a Google Colab notebook, on a free T4, and prove it to yourself right now:

The Proteus Playground https://colab.research.google.com/github/Zen-Sherbert/Proteus-Attention/blob/main/TinyPlayground.ipynb

It runs flawlessly on both CUDA and ROCm. It works. With the proof-of-concept out of the way, here are the three core ideas that got me here.

1. DNA - Tokens have value.

My journey started with a simple idea: tokens mean something. They have value. So why don't we use it?

I built a system called DNA, where each attention "gate" learns a "taste" for certain tokens and pulls them in like gravity. The crazy part? On a raw, untrained model, I found that 334 out of 500 tokens were already being caught by this system. It's a natural, emergent behavior.
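
For readers who want something concrete, here is a minimal sketch of what a gate with a "taste" could look like. It assumes each gate holds a prototype vector and "catches" any token whose dot-product affinity with that prototype clears a threshold. The names `gate_catches`, `gate_protos`, and `threshold` are hypothetical and not taken from the Proteus code, and the random vectors only stand in for an untrained model.

```python
import random

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def gate_catches(token_embs, gate_protos, threshold=0.0):
    """For each gate, list the indices of tokens whose affinity
    (dot product with the gate's prototype) exceeds the threshold.
    This scoring rule is an assumption made for illustration."""
    caught = []
    for proto in gate_protos:
        caught.append(
            [i for i, emb in enumerate(token_embs) if dot(emb, proto) > threshold]
        )
    return caught

# Random stand-ins for an untrained model: 500 tokens, 4 gates, 8 dims.
random.seed(0)
dim = 8
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(500)]
gates = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(4)]

caught = gate_catches(tokens, gates, threshold=1.0)
covered = set().union(*caught)  # tokens caught by at least one gate
print(len(covered), "of", len(tokens), "tokens caught by at least one gate")
```

Even with random prototypes, some fraction of tokens clears the threshold, which is one plausible reading of why an untrained model already catches tokens. The exact count depends entirely on the threshold and the dimensions chosen here.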

2. The Alpha Slider - "Why can't I just change my model?"

I hated that I couldn't just switch my model from dense, to sparse, to linear whenever I wanted. So, I built a custom Triton kernel to do exactly that.

The result is a single knob called alpha:

Dense, high-fidelity? alpha = 0.0.

Balanced sub-quadratic? alpha = 0.3.

Screaming-fast linear time? alpha = 1.0 and the attention mechanism goes brrrrrr.
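
To make the three settings above concrete, here is a sketch of one way a single knob could move between those regimes. It assumes each query attends to only its top `n^(1 - alpha)` keys, so the total work scales like `n^(2 - alpha)`: quadratic at 0.0, sub-quadratic at 0.3, linear at 1.0. That mapping, and the names `key_budget` and `alpha_attention`, are assumptions for illustration. The actual Triton kernel may work quite differently.

```python
import math

def key_budget(n_tokens, alpha):
    """Hypothetical mapping from alpha to keys kept per query: n^(1 - alpha)."""
    return max(1, math.ceil(n_tokens ** (1.0 - alpha)))

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def alpha_attention(queries, keys, values, alpha):
    """Top-k attention where k is set by alpha.
    Note: this reference version still scores every key, so it
    illustrates the budget only, not the speedup a real kernel needs."""
    n = len(keys)
    k = key_budget(n, alpha)
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, key)) for key in keys]
        # Keep only the k highest-scoring keys for this query.
        top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
        weights = softmax([scores[i] for i in top])
        outputs.append(
            [sum(w * values[i][d] for w, i in zip(weights, top)) for d in range(dim_v)]
        )
    return outputs

# Keys kept per query for a 1000-token context at each setting.
for a in (0.0, 0.3, 1.0):
    print("alpha =", a, "->", key_budget(1000, a), "keys per query")
```

At alpha = 0.0 every key is kept and this reduces to ordinary softmax attention. At alpha = 1.0 each query keeps a single key, so the attended work per query no longer grows with context length.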

3. Chunking & RoPE - "So I got rid of it."

My new systems got me far, but the VRAM bottleneck was still a headache. So I got rid of it.

The idea is simple: chunking. Break a massive context into small pieces, shunt them to system RAM, and use a tiny VRAM buffer for only the most important tokens.

DNA tells us what's important. As a Hail Mary, I added RoPE to preserve where it came from. This combination creates contextual teleportation. It allows the model to create a perfect "highlight reel" and reason over it as if critical facts, separated by thousands of pages, were sitting right next to each other. It's your own little wormhole across data space.
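
A rough sketch of the chunk-and-keep idea, under stated assumptions: the context is streamed chunk by chunk, a scoring function (standing in for DNA) rates each token, and a fixed-size buffer keeps only the top scorers along with their original positions. RoPE is then applied using those original positions, so the kept tokens still "know" where they came from. `build_highlight_reel`, `score_fn`, and `buffer_size` are hypothetical names, and `rope_rotate` is the standard rotary formula rather than anything lifted from the repo.

```python
import heapq
import math

def build_highlight_reel(chunks, score_fn, buffer_size):
    """Stream chunks and keep the buffer_size highest-scoring tokens.
    Returns (original_position, token) pairs in document order."""
    heap = []  # min-heap of (score, position, token); weakest entry at heap[0]
    position = 0
    for chunk in chunks:  # in a real system each chunk would be paged in from system RAM
        for token in chunk:
            item = (score_fn(token), position, token)
            if len(heap) < buffer_size:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                heapq.heapreplace(heap, item)  # evict the weakest entry
            position += 1
    return sorted((pos, tok) for _, pos, tok in heap)

def rope_rotate(vec, position, base=10000.0):
    """Standard rotary embedding: rotate each (even, odd) pair of
    dimensions by an angle proportional to the token's position."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.extend([vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c])
    return out

# Toy context: uppercase tokens play the role of "important" facts.
chunks = [["a", "b"], ["NEEDLE", "c"], ["d", "KEY"]]
reel = build_highlight_reel(
    chunks, score_fn=lambda t: 1.0 if t.isupper() else 0.0, buffer_size=2
)
print(reel)

# Each kept token is rotated by its ORIGINAL position, not its buffer slot.
rotated = [rope_rotate([1.0, 0.0], pos) for pos, _ in reel]
```

The buffer never holds more than `buffer_size` entries no matter how long the context is, which is the property that keeps VRAM use flat. Whether attention over such a reel preserves enough information is exactly the open question the Colab is meant to test.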

TL;DR: I built an extreme context system that costs less than Minecraft to run. Would love feedback, as I'm still exploring how far it can go.

Github: https://github.com/Zen-Sherbert/Proteus-Attention/tree/main

Comments

Zen_Sherbert•4h ago
A little bit about the origin story for those who are interested:

This whole thing started with me trying to implement sparsity, and getting it totally wrong. The DNA idea came to me in the middle of the night during my shift as an asset protection officer. The rest of it was just fumbling from one discovery to the next, mostly ignoring the "right" way to do things.

I'm an 8-year veteran, a father of three, and I just finished my bachelor's. I am not an AI researcher. If I can build this, you can do something infinitely better.

Please, try the Colab. Break it. Play with it. I implore you to tell me how it breaks. I'm excited to see what the community thinks.
