The Problem:
Standard Federated Learning (FL) hits a wall at scale. When you move from a few hundred nodes to 500,000, two things happen: communication overhead explodes ($O(n)$ or $O(n^2)$), and the "honest majority" assumption falls apart. Classic BFT protocols such as PBFT and HotStuff tolerate fewer than 33% malicious actors, and even honest-majority designs are capped below 50%.

The Breakthrough:
I developed the Sovereign-Mohawk Protocol. In a stress test conducted yesterday, it coordinated 500,000 nodes in 4 minutes and 8 seconds, with model accuracy holding at 81.0% even when 55.5% of nodes acted maliciously (gradient poisoning and Sybil attacks).

How it works (The TL;DR):

Hierarchical Streaming Aggregation: Instead of a central parameter server, Mohawk uses a tree-based batching architecture. This drops communication complexity to $O(d \log n)$.

Tiered Rényi Differential Privacy: I integrated DP directly into the consensus layer. By using Rényi DP ($\epsilon = 0.98$), we can filter outliers (malicious gradients) more aggressively than standard median-based aggregators.

zk-SNARK Verifiability: Every aggregation step generates a 200-byte proof. The central coordinator can verify the integrity of 500,000 contributions in constant time without re-computing the gradients.

The Stress Test Results (Feb 24, 2026):
40% Byzantine: 86.6% accuracy | 9.1s avg round time
50% Byzantine: 85.8% accuracy | 10.5s avg round time
55.5% Byzantine: 81.0% accuracy | 9.9s avg round time (the theoretical "Mohawk Limit")

Why Solo?
I wanted to prove that Sovereign AI infrastructure doesn't require a Google-sized team. This implementation is written in Go with a Wasm host, allowing it to run on anything from an NVIDIA Jetson to an Apple Silicon NPU.

Links:
Repo: Sovereign Map Federated Learning
Research/Docs: Sovereign-Mohawk Protocol Site

I'm particularly looking for feedback on the BFT boundary proofs. Is $55.5\%$ the absolute limit for DP-weighted aggregation, or can we push to 60% with higher noise injection?
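
To make the tree-based batching idea behind Hierarchical Streaming Aggregation concrete, here is a minimal Go sketch. It is not the Sovereign-Mohawk code: the names `partial`, `merge`, `treeAggregate` and the `fanOut` parameter are hypothetical, it computes a plain mean with no DP filtering or proofs, and it assumes every gradient has the same length. It only illustrates why each aggregator handles a bounded number of $d$-dimensional messages while the tree depth grows logarithmically in $n$.

```go
package main

import "fmt"

// partial is a running gradient sum plus the number of leaves it covers.
// (Hypothetical type, introduced only for this illustration.)
type partial struct {
	sum   []float64
	count int
}

// merge adds up a small group of partials into one.
// Assumes all sums in the group have the same length.
func merge(group []partial) partial {
	out := partial{sum: make([]float64, len(group[0].sum))}
	for _, p := range group {
		for i, v := range p.sum {
			out.sum[i] += v
		}
		out.count += p.count
	}
	return out
}

// treeAggregate reduces leaf gradients level by level, fanOut children per
// parent, and returns the global mean and the number of tree levels used.
func treeAggregate(grads [][]float64, fanOut int) ([]float64, int) {
	if len(grads) == 0 || fanOut < 2 {
		return nil, 0
	}
	level := make([]partial, len(grads))
	for i, g := range grads {
		level[i] = partial{sum: append([]float64(nil), g...), count: 1}
	}
	depth := 0
	for len(level) > 1 {
		var next []partial
		for start := 0; start < len(level); start += fanOut {
			end := start + fanOut
			if end > len(level) {
				end = len(level)
			}
			next = append(next, merge(level[start:end]))
		}
		level = next
		depth++
	}
	root := level[0]
	mean := make([]float64, len(root.sum))
	for i, v := range root.sum {
		mean[i] = v / float64(root.count)
	}
	return mean, depth
}

func main() {
	grads := [][]float64{{1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}}
	mean, depth := treeAggregate(grads, 2)
	fmt.Println(mean, depth) // prints "[5 6] 3"
}
```

Passing sums and counts upward, rather than means, keeps the root result equal to the flat average even when the last group at a level is smaller than `fanOut`.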
Comments
lerp-io•1h ago
From what I understand, there is a tradeoff with this approach: because you don't have access to the raw data, compression is only as good as the model that compresses it, and a gradient can also be decompressed/inverted to reconstruct the raw data.
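
A toy illustration of the inversion point, as a sketch under strong assumptions: a single linear layer $y = Wx + b$ with squared-error loss and one training example. In that case $\partial L/\partial W_{ij} = (\partial L/\partial b_i)\, x_j$, so anyone who sees both gradients can divide them to recover $x$ exactly. The function names `gradients` and `invert` are made up for this example, and real attacks on deep models are harder than this.

```go
package main

import "fmt"

// gradients returns dL/dW and dL/db for y = Wx + b with
// loss L = 0.5 * ||y - t||^2 on a single example (x, t).
func gradients(W [][]float64, b, x, t []float64) ([][]float64, []float64) {
	dW := make([][]float64, len(W))
	db := make([]float64, len(b))
	for i := range W {
		y := b[i]
		for j, xj := range x {
			y += W[i][j] * xj
		}
		db[i] = y - t[i] // dL/db_i is the residual
		dW[i] = make([]float64, len(x))
		for j, xj := range x {
			dW[i][j] = db[i] * xj // dL/dW_ij = residual_i * x_j
		}
	}
	return dW, db
}

// invert recovers x from the shared gradients alone: x_j = dW_ij / db_i,
// using the first row whose bias gradient is non-zero.
func invert(dW [][]float64, db []float64) []float64 {
	for i, d := range db {
		if d != 0 {
			x := make([]float64, len(dW[i]))
			for j, g := range dW[i] {
				x[j] = g / d
			}
			return x
		}
	}
	return nil // every residual was zero, so this layer leaks nothing
}

func main() {
	W := [][]float64{{0.5, -1}, {2, 0.25}}
	b := []float64{0, 1}
	x := []float64{3, 4} // the "private" input
	t := []float64{1, 0}
	dW, db := gradients(W, b, x, t)
	fmt.Println(invert(dW, db)) // prints "[3 4]"
}
```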
lerp-io•1h ago