FeedFirewall runs client-side pattern matching against posts as you scroll. When it detects rage-bait, it adds an unobtrusive badge. Click the badge, and you see exactly which signals fired: engagement-bait phrases, inflammatory language, divisive framing, or platform-specific patterns. It can also optionally flag AI-generated text.
Technical details: pure client-side JS, no external API calls. Weighted scoring across pattern categories.
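For concreteness, here's a minimal sketch of what weighted scoring across pattern categories can look like. The categories mirror the signals above, but the pattern lists, weights, and threshold are illustrative, not the extension's actual ones:

  // Illustrative category definitions (not the real lists or weights).
  const CATEGORIES = [
    { name: "engagement-bait", weight: 3, patterns: [/am i the only one/i, /who else (thinks|agrees)/i] },
    { name: "inflammatory",    weight: 2, patterns: [/absolutely disgusting/i, /wake up/i] },
    { name: "divisive",        weight: 2, patterns: [/us vs\.? them/i, /if you disagree you('| a)re/i] },
  ];

  function scorePost(text) {
    const hits = [];
    let score = 0;
    for (const cat of CATEGORIES) {
      // Count how many patterns in this category match the post text.
      const matched = cat.patterns.filter((p) => p.test(text));
      if (matched.length) {
        score += cat.weight * matched.length;
        hits.push({ category: cat.name, matches: matched.map(String) });
      }
    }
    // Threshold is illustrative; hits feed the badge's "why was this flagged" view.
    return { score, hits, flagged: score >= 4 };
  }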
Works on Twitter/X, LinkedIn, Reddit (old + new), and YouTube comments. 100% local, no servers, no data collection, no accounts.
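And a rough sketch of how the platform-specific part might be wired up in a content script. The hostnames and selectors below are placeholders (real selectors depend on each site's DOM and change often), scorePost is from the sketch above, and attachBadge is a hypothetical helper:

  // Placeholder hostname-to-selector map; not the extension's real selectors.
  const PLATFORMS = {
    "x.com": '[data-testid="tweetText"]',
    "www.linkedin.com": ".feed-post-text",
    "old.reddit.com": ".entry .usertext-body",
    "www.youtube.com": "#content-text",
  };

  const postSelector = PLATFORMS[location.hostname];
  if (postSelector) {
    for (const node of document.querySelectorAll(postSelector)) {
      const { flagged, hits } = scorePost(node.innerText);
      if (flagged) attachBadge(node, hits); // attachBadge: hypothetical badge-rendering helper
    }
  }
  // A real implementation would also re-scan on feed mutations (e.g. a MutationObserver)
  // since these sites load posts as you scroll.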
It's heuristic-based, not ML, so it produces both false positives and false negatives. But honestly, for my own browsing, just having that moment of "oh, this post is trying to make me angry" before I engage has been worth it.
The extension is free and available for Chrome (https://chromewebstore.google.com/detail/feedfirewall/dchnce...) and Firefox (https://addons.mozilla.org/en-US/firefox/addon/feedfirewall/).
The main technical question I'm wrestling with: is weighted keyword matching too naive for this, or are heuristics sufficient when the manipulation tactics are this formulaic?