I run a lot of headless Screaming Frog crawls on servers.
The main bottleneck is that while the SF CLI can consume configuration files (.seospiderconfig), it cannot produce them. If you want to run a crawl with complex settings (like custom extractions or specific excludes), you are forced to open the desktop GUI, configure the crawl manually, save the config, and upload it to the server.
You can't just script the file generation because the configs are serialized Java objects (binary blobs), not JSON or XML.
I decided to reverse engineer the format. A hex dump confirmed it was standard Java serialization. Instead of writing a fragile parser, I realized I could use the application's own JARs to handle the heavy lifting.
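For anyone curious what that hex dump check looks like: Java serialization streams always begin with the magic bytes 0xACED followed by stream version 0x0005, so the format is easy to spot. A few lines of Python are enough to confirm it (the filename is just a placeholder):

```python
# Java serialization streams begin with the magic 0xACED and stream
# version 0x0005 -- exactly what the hex dump showed.
with open("example.seospiderconfig", "rb") as f:  # placeholder filename
    header = f.read(4)

print("Standard Java serialization stream"
      if header == b"\xac\xed\x00\x05"
      else f"Unexpected header: {header.hex()}")
```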
I built two tools to solve this:
Python Library: Uses JPype to bridge Python to the local SF JARs. You can instantiate config objects, modify them (e.g., config.set_user_agent(...)), and serialize them back to disk. Great for Airflow/Python pipelines; a rough sketch of how the bridge works follows this list.
Java Utility: A standalone CLI tool to do the same thing if you prefer a native Java environment or don't want the Python overhead.
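To give a feel for the approach (this is not the library's actual API), here's a minimal sketch of the JPype bridge. The JAR path and the config class/setter names are assumptions for illustration only; SF's internal packages are undocumented, so the real identifiers differ:

```python
import jpype
import jpype.imports

# Assumption: SF JARs at a typical Linux install path; adjust for yours.
jpype.startJVM(classpath=["/usr/share/screamingfrogseospider/*"])

from java.io import FileOutputStream, ObjectOutputStream

# Hypothetical class name -- SF's package layout is internal, so treat
# this identifier (and the setter below) as placeholders.
SeoSpiderConfig = jpype.JClass("uk.co.screamingfrog.seospider.config.Config")

config = SeoSpiderConfig()
config.setUserAgent("my-pipeline-bot/1.0")  # placeholder setter

# Plain Java serialization writes the same binary blob the GUI saves.
out = ObjectOutputStream(FileOutputStream("generated.seospiderconfig"))
out.writeObject(config)
out.close()
```

The key design choice: because the application's own classes do the (de)serialization, the output stays byte-compatible with whatever the GUI produces, with no hand-written parser to break on the next SF release.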
What this enables:
True Headless Automation: Generate valid configs on the fly right before a crawl runs (first sketch after this list).
Diffing: Compare two binary config files to debug "config drift" (e.g., seeing exactly why a crawl limit changed); second sketch after this list.
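For the automation case, the flow is: generate the config in Python, then hand it straight to the headless CLI. A minimal sketch, assuming the generated.seospiderconfig from the earlier example and the documented SF CLI flags (verify them against your installed version):

```python
import subprocess

# Launch a headless crawl with the freshly generated config.
# Flags per the SF CLI docs (--crawl, --headless, --config,
# --output-folder); double-check against your SF version.
subprocess.run(
    [
        "screamingfrogseospider",
        "--crawl", "https://example.com",
        "--headless",
        "--config", "generated.seospiderconfig",
        "--output-folder", "/tmp/crawl-output",
    ],
    check=True,
)
```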
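And for diffing, one approach is to deserialize both blobs through the same JARs and compare whatever the objects expose. This sketch leans on toString(), which may or may not be informative for SF's classes; a real implementation would walk getters field by field. Filenames and the JAR path are placeholders:

```python
import difflib
import jpype
import jpype.imports

jpype.startJVM(classpath=["/usr/share/screamingfrogseospider/*"])  # assumed path

from java.io import FileInputStream, ObjectInputStream

def load_config(path):
    # readObject() resolves the SF classes because they're on the classpath.
    stream = ObjectInputStream(FileInputStream(path))
    try:
        return stream.readObject()
    finally:
        stream.close()

before = load_config("last-week.seospiderconfig")
after = load_config("today.seospiderconfig")

# str() maps to toString(); usefulness depends on SF's implementations.
print("\n".join(difflib.unified_diff(
    str(before).splitlines(), str(after).splitlines(),
    fromfile="last-week", tofile="today", lineterm="")))
```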
Feedback welcome—especially on the JPype implementation, as that was the trickiest part to stabilize!