Hey HN, I built this because my AI agent was spending 8 seconds and 300MB of RAM just to search X. That felt wrong — the data is right there behind one HTTP request, but the "standard" approach is to launch a full browser, render the page, and scrape the DOM.
web2cli makes direct HTTP requests using your browser cookies. No Chromium, no Selenium, no headless anything. The tricky part was TLS fingerprinting - Cloudflare blocks Python's default TLS stack (JA3 fingerprint mismatch), so web2cli uses curl_cffi with BoringSSL to impersonate Chrome at the TLS level. X.com was even harder - their search requires a cryptographic nonce generated by obfuscated browser JS, which the community reverse-engineered.
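For anyone unfamiliar with JA3, here is a rough sketch of how the fingerprint is derived and why two TLS stacks end up with different hashes. The ClientHello values below are made up for illustration (not captured from real Python or Chrome handshakes), and this is not web2cli's code; real JA3 tooling also skips GREASE values, which this sketch leaves out.

```python
import hashlib


def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Build a JA3 string: five comma-separated fields, with the decimal
    values inside each field joined by '-'."""
    fields = [[tls_version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(v) for v in field) for field in fields)


def ja3_hash(ja3):
    """The JA3 fingerprint is the MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Illustrative ClientHello parameters only (assumed, not real captures).
# Same ciphers in a different order plus a different extension list is
# already enough to change the hash.
python_like = ja3_string(771, [4866, 4867, 4865], [0, 11, 10, 35], [29, 23, 24], [0])
chrome_like = ja3_string(771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])

if __name__ == "__main__":
    print(python_like, ja3_hash(python_like))
    print(chrome_like, ja3_hash(chrome_like))
```

Because the hash covers cipher order and the extension list, changing the User-Agent header alone does nothing here; the handshake itself has to look like Chrome's, which is the job curl_cffi does.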
Six adapters today: HN, X, Discord, Slack, Stack Overflow, Reddit. Each adapter is a YAML file - writing a new one takes ~30 minutes (or ~3 minutes for your coding agent) and doesn't require Python code for most sites (although it's possible to add a custom Python provider, like I did with X).
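To give a feel for the declarative idea, here is a minimal sketch of what "an adapter is just data" could look like once the YAML is loaded into a dict. The spec keys (`base_url`, `params`, `items`, `fields`) and the helper names are hypothetical, not web2cli's actual schema, and the endpoint is the public HN Algolia search API, which may not be what the real HN adapter uses.

```python
from urllib.parse import urlencode

# Hypothetical adapter spec, shown as the dict a YAML file might load into.
# web2cli's real schema may look quite different.
HN_SEARCH = {
    "base_url": "https://hn.algolia.com/api/v1/search",
    "params": {"query": "{query}", "hitsPerPage": "{limit}"},
    "items": "hits",  # key in the JSON response that holds the result list
    "fields": {"title": "title", "points": "points", "url": "url"},
}


def build_url(spec, **args):
    """Fill the parameter templates with CLI arguments and build the URL."""
    params = {name: str(tpl).format(**args) for name, tpl in spec["params"].items()}
    return spec["base_url"] + "?" + urlencode(params)


def extract_rows(spec, payload):
    """Map each item in the JSON payload to the output columns in the spec."""
    rows = []
    for item in payload.get(spec["items"], []):
        rows.append({out: item.get(src) for out, src in spec["fields"].items()})
    return rows


if __name__ == "__main__":
    print(build_url(HN_SEARCH, query="tls fingerprint", limit=5))
    sample = {"hits": [{"title": "Show HN", "points": 42, "url": None, "extra": 1}]}
    print(extract_rows(HN_SEARCH, sample))
```

The point of keeping the spec as plain data is that a new site mostly means a new mapping, while request signing or nonce generation (as with X) would still need a custom provider.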
I'm working on web2cli Cloud - think "OAuth for sites that don't have OAuth." Your users log in via a sandboxed browser, your agent gets an opaque session token, cookies never touch your server.
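Roughly, the opaque-token idea can be sketched like this. It is an in-memory toy under my own assumptions (the `SessionVault` class and its methods are hypothetical, not the Cloud API): the vault keeps the cookies, and the caller only ever holds a random token that reveals nothing about them.

```python
import secrets


class SessionVault:
    """Hypothetical in-memory vault: hands out opaque tokens and keeps the
    cookies on the vault side only."""

    def __init__(self):
        self._cookies_by_token = {}

    def store(self, cookies):
        """Save a cookie jar and return a random token that refers to it."""
        token = secrets.token_urlsafe(32)
        self._cookies_by_token[token] = dict(cookies)
        return token

    def cookies_for(self, token):
        """Resolve a token to cookies; raises KeyError for unknown tokens."""
        return self._cookies_by_token[token]

    def revoke(self, token):
        """Forget a session; later lookups with this token fail."""
        self._cookies_by_token.pop(token, None)


if __name__ == "__main__":
    vault = SessionVault()
    token = vault.store({"session_id": "example-cookie-value"})
    print(token)  # random string, unrelated to the cookie value
```

In a real deployment the lookup would happen on the service that makes the outbound request, so the agent's server only ever stores the token.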
Happy to go deep on the adapter architecture, anti-bot bypasses, or the economics of browser automation vs direct HTTP.
pancsta•37m ago
„Every website” == 6 websites. I like the table layout of the results tho.
michaeloblak•26m ago
Ha, fair point — "every website" is the vision, 6 is the MVP :)
The adapter model is designed so adding a new site is a single YAML file (~30 min of work, or ~3 min with a coding agent). No Python code needed for most sites. PRs welcome if there's a site you'd want to see!