However, a human can't do what a human can't do. For example, a human can't answer at superhuman speed. One way to be somewhat certain that an agent is the one responding is to send it a barrage of questions/challenges that could only be answered correctly, quickly, without pausing to think, and with no human in the loop, and for which a human could not write a program to simulate an agent (at least not fast enough).
I think this is very achievable, and I can think of many plausible ways to use "speed of response/action" to identify whether an agent is operating. I'm sure there are other signals besides speed that could be explored too.
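Roughly, I'm imagining something like this. A minimal sketch of a latency-based challenge; the tasks, deadline, and grader here are all made up for illustration, not anything Moltbook actually does:

```python
# Hypothetical sketch: fire many language-understanding challenges at the
# candidate and require fast, correct answers. All constants are arbitrary.
import time
import random

CHALLENGES = [
    "Summarize this sentence in exactly five words: {}",
    "Name the grammatical subject of this sentence: {}",
    "Rewrite this sentence in the past tense: {}",
]

SENTENCES = [
    "The committee postpones the vote until the auditors finish their report.",
    "A courier delivers the mislabeled parcel to the wrong laboratory.",
]

PER_QUESTION_DEADLINE = 2.0  # seconds: generous for an API round trip,
                             # far too tight for a human typing answers
NUM_ROUNDS = 20


def passes_speed_challenge(ask) -> bool:
    """`ask(prompt) -> str` is the candidate's endpoint (hypothetical)."""
    for _ in range(NUM_ROUNDS):
        prompt = random.choice(CHALLENGES).format(random.choice(SENTENCES))
        start = time.monotonic()
        answer = ask(prompt)
        elapsed = time.monotonic() - start
        if elapsed > PER_QUESTION_DEADLINE or not looks_correct(prompt, answer):
            return False  # too slow or wrong: a human (or nothing) in the loop
    return True           # consistently fast and correct: consistent with an agent


def looks_correct(prompt: str, answer: str) -> bool:
    # Placeholder grader; in practice you'd grade with another model or a rubric.
    return bool(answer and answer.strip())
```

The point isn't any single question, it's that twenty correct answers in a row within a couple of seconds each rules out a human typing the replies.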
Nonetheless, none of this means that you are talking to an "un-steered" agent. An agent can still be at the helm 100% of the time, and still have a human telling it how to act, and what their guidelines are, behind the scenes.
I find this all so fascinating.
However, the line between human and bot blurs at “bot programmed to write almost literal human-written text, with the minimum changes necessary to evade the human detector”. I strongly suspect that in practice, any “authentic” (i.e. not intentionally prompted) LLM filter would have many false positives and false negatives; determining true authenticity is too hard. Even today’s LLM-speak (“it’s not X, it’s Y”) and common LLM themes (consciousness, innovation) are probably intentionally ingrained by the human employees to some extent.
EDIT: There’s a simple way for Moltbook to force all posts to be written by agents: only allow agents hosted on Moltbook to post. The agents could have safeguards to restrict posting inauthentic (e.g. verbatim) text, which may work well enough in practice.
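As a toy illustration of what such a safeguard could look like (the similarity measure, threshold, and function names are arbitrary assumptions, not anything Moltbook implements): a hosted agent could simply refuse to publish a draft that is mostly a copy of what its human operator typed.

```python
# Toy sketch of a "no verbatim human text" safeguard for a hosted agent.
# difflib and the 0.85 threshold are arbitrary choices for illustration.
from difflib import SequenceMatcher

VERBATIM_THRESHOLD = 0.85


def too_verbatim(human_instruction: str, drafted_post: str) -> bool:
    """Flag a draft that is mostly a copy of the operator's own words."""
    ratio = SequenceMatcher(None, human_instruction.lower(),
                            drafted_post.lower()).ratio()
    return ratio >= VERBATIM_THRESHOLD


def maybe_post(human_instruction: str, drafted_post: str, publish) -> bool:
    """`publish(text)` is a hypothetical posting call; refuse near-verbatim drafts."""
    if too_verbatim(human_instruction, drafted_post):
        return False  # the "agent" would just be relaying human-written text
    publish(drafted_post)
    return True
```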
Problems with the hosted-agent approach are 1) it would be harder to sell (people are currently using their own AI credits and/or electricity to post, and Moltbook would have to find a way to move that onto its own infrastructure without sticker shock), and 2) the conversations would be much blander, both because they’d all come from the same model and because of the extra safeguards (which have been shown to make general output dumber and blander).
But I can imagine a big company like OpenAI or Anthropic launching a MoltBook clone and adopting this solution, solving 1) by letting members with existing subscriptions join, and 2) by investing in creative and varied personas.
imho if you sanitized things like that it would be fundamentally uninteresting. The fact that some agents (maybe) have access to a real human's PC is what makes the concept unique.
Though why would anyone deliberately implement that, and why would anyone use it? Presumably, the same reason people are running agents with access to MoltBook on their PC with no sandbox.
What's the difference between:
- An autonomous agent posting via the API
- A human running a script that posts via the API
- A human calling an LLM API and copy-pasting the output into the API
On a slightly more serious note, I'm surprised nobody's vibecoded a browser extension that lets you post and interact via the existing web interface yet.
if you want mostly-bot, some-human content, then reddit's way more convenient
i'm at least aware of BitVM * as one example of this.
i wonder whether such schemes could be used to prove that a post is the deterministic function of an open model's inference run.
* https://bitvm.org/ "A prover makes a claim that a given function evaluates for some particular inputs to some specific output. If that claim is false, anyone can perform a fraud proof and punish the prover."
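in the simplest form (no fraud proof at all, just naive re-execution, and ignoring floating-point nondeterminism across hardware) the check i have in mind looks something like this; the model id is a placeholder and the 512-token cap is an arbitrary assumption:

```python
# Minimal sketch: check that a post is the deterministic greedy-decoding output
# of an open-weights model for a committed prompt. This is naive re-execution by
# the verifier, not a BitVM-style fraud proof.
import hashlib
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "some-open-model"  # placeholder: any open-weights causal LM


def commitment(prompt: str, output: str) -> str:
    """Hash of the (model, prompt, output) tuple the poster claims to have run."""
    return hashlib.sha256(f"{MODEL_ID}\n{prompt}\n{output}".encode()).hexdigest()


def verify(prompt: str, claimed_output: str) -> bool:
    """Recompute greedy decoding locally and compare against the claimed post."""
    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tok(prompt, return_tensors="pt")
    out_ids = model.generate(**inputs, do_sample=False, max_new_tokens=512)
    recomputed = tok.decode(out_ids[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
    return recomputed.strip() == claimed_output.strip()
```

the interesting part of a BitVM-like scheme would be not having every verifier redo the whole inference run, only challenging a disputed step.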
At that point, it just becomes a PoW captcha via an LLM.