Show HN: I'm building an AI-proof writing tool. How would you defeat it?

https://auth-auth.vercel.app/

24•callmeed•1w ago

Comments

ticulatedspline•1w ago

- Screencap and OCR the prompt

- feed it to an llm to answer

- use a tool [1] to emulate human typing cadence.

- Use a tool to send mouse events to the browser window.

[1] https://multilogin.com/blog/paste-as-human-typing/

Not sure if you collect any feedback at the end, but if you do, the alternative is poisoning your training set: marking AI output as real and real writing as fake. The former is automatable, and given a good enough reason you could probably Mechanical-Turk a large number of people at low cost to write for real and then report that they were actually AIs.

forgotpwd16•1w ago

Or write a script that does all that and bind it to a shortcut. Eventually: https://files.catbox.moe/zargud.png. (The window/tab switch count was 1~2 when I finished; it reached 9 after a few attempts to screenshot it.)

nine_k•1w ago

An AI-proof writing tool is a pen and paper, on a glass table, in a Faraday-cage locked room, etc, etc.

Depending on how high the stakes are, weeding out just the most obvious cases of AI usage may be enough. But wherever electronic input is involved, it can be emulated in ways that are impossible to detect on a reasonable budget. Content analysis may help somewhat, but again it would only detect the more obvious cases.

dpoloncsak•1w ago

>Pasting and DOM manipulation are disabled to ensure all writing is original.

>We track telemetry such as typing speed, pauses, tab changes, and window focus events.

People figure out ways around this for, like, RuneScape bots and other low-stakes situations. I don't think it would hold up against anyone other than casual users. Seems like an agent could whip something up in AutoHotkey or something.

I get this is the extreme end, but if this gets popular enough, can't you write like a custom 'keyboard' driver that just takes AI input and 'types' it? Random delay between keystrokes, whatever....

It also can't be used to verify existing work, right? I can't see if a student's essay is LLM-written. Is there any real-world use you see? Or is this just a fun toy?

yjftsjthsd-h•1w ago

> I get this is the extreme end, but if this gets popular enough, can't you write like a custom 'keyboard' driver that just takes AI input and 'types' it? Random delay between keystrokes, whatever....

We can easily go one more step than drivers; making a cheap microcontroller enumerate as a USB keyboard is easy.

dpoloncsak•1w ago

Ai powered rubber duckies go crazyyyyyyy

callmeed•1w ago

School/students were my target user when I created this. But also mostly just a fun toy.

ephou7•1w ago

I used this:

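    #!/usr/bin/python3
    import subprocess
    import time
    import random

    # Text to be "typed" into the focused window
    with open("/tmp/x") as f:
        t = f.read()

    # Send one character at a time, sleeping a random (half-normal) delay between keys
    for c in t:
        subprocess.call(["xdotool", "type", c])
        time.sleep(abs(random.gauss(0, 0.07)))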

And pasted a random Hacker News comment:

Authenticity Score: 81 (Highly Authentic)

Words per minute: 162
Keystroke variance: 52ms
Paste attempts: 0
Window/tab switches: 4
Pauses (≥10s): 0
DOM manipulations: 0

You failed.
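For anyone curious about the timing that script produces: abs(random.gauss(0, 0.07)) is a half-normal distribution, whose mean is sigma * sqrt(2/pi), about 56 ms per character. The sketch below works that out and converts it to a words-per-minute ceiling. The 5-characters-per-word convention is an assumption (the tool's own WPM formula is not stated in the thread), and the helper names are made up for illustration.

```python
import math
import random

SIGMA = 0.07          # standard deviation passed to random.gauss in the script above
CHARS_PER_WORD = 5    # assumption: the common 5-characters-per-word convention


def half_normal_mean(sigma: float) -> float:
    """Expected value of |X| for X ~ Normal(0, sigma)."""
    return sigma * math.sqrt(2.0 / math.pi)


def wpm_from_delay(seconds_per_char: float,
                   chars_per_word: int = CHARS_PER_WORD) -> float:
    """Words per minute if every character costs seconds_per_char."""
    return 60.0 / (seconds_per_char * chars_per_word)


def simulated_mean_delay(sigma: float, n: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo check of the analytic mean, using the same expression as the script."""
    rng = random.Random(seed)
    return sum(abs(rng.gauss(0.0, sigma)) for _ in range(n)) / n


if __name__ == "__main__":
    mean_delay = half_normal_mean(SIGMA)
    print(f"analytic mean delay:  {mean_delay * 1000:.1f} ms")
    print(f"simulated mean delay: {simulated_mean_delay(SIGMA) * 1000:.1f} ms")
    print(f"WPM ceiling from sleeps alone: {wpm_from_delay(mean_delay):.0f}")
```

Under those assumptions the sleeps alone would allow roughly 215 WPM. The reported 162 WPM works out to about 74 ms per character, so the gap is plausibly the cost of spawning one xdotool process per keystroke, though that is a guess and not something measured here.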

callmeed•1w ago

That's cool, thanks for sharing.

Is there a way to detect this approach?

cyode•1w ago

Most text composition involves backspaces and cursor movement. This script simulated neither afaik, though I’m sure it could have without much more difficulty.

Probably you will need to track many signals like those and use a model that takes them all into account.
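A minimal sketch of what "many signals" could look like as a feature vector, assuming the tool can log (timestamp, key) pairs. The KeyEvent type, the feature names, and the key sets are all hypothetical; the key strings follow the browser's KeyboardEvent.key naming. This only extracts features and says nothing about how a model should weigh them.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class KeyEvent:
    t_ms: float   # timestamp in milliseconds
    key: str      # e.g. "a", "Backspace", "ArrowLeft"


# Hypothetical groupings of keys that indicate editing or cursor movement
EDIT_KEYS = {"Backspace", "Delete"}
CURSOR_KEYS = {"ArrowLeft", "ArrowRight", "ArrowUp", "ArrowDown", "Home", "End"}


def extract_features(events: list[KeyEvent]) -> dict[str, float]:
    """Summarize a typing session as a few numeric signals."""
    if len(events) < 2:
        raise ValueError("need at least two key events")
    # Time between consecutive key events
    gaps = [b.t_ms - a.t_ms for a, b in zip(events, events[1:])]
    n = len(events)
    return {
        "edit_ratio": sum(e.key in EDIT_KEYS for e in events) / n,
        "cursor_ratio": sum(e.key in CURSOR_KEYS for e in events) / n,
        "gap_mean_ms": mean(gaps),
        "gap_stdev_ms": pstdev(gaps),
    }
```

A session typed by the script above would come out with edit_ratio and cursor_ratio both at 0.0, which is the tell being described. It is also trivially fixable by the attacker, so each signal like this probably only raises the bar a little.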

abetusk•1w ago

I think your approach is pretty much fundamentally flawed.

Put it this way, let's say someone recorded typing in the paragraph that you presented but saved the keystrokes, pauses, etc. Now they replay it back, with all the pauses and keystrokes, maybe with the `xdotool` as above, how could you possibly know the difference?

Your method is playing a statistical game of key presses, pauses, etc. Anyone who understands your method will probably not only be able to create a distribution that matches what you expect but could, in theory, create something that looks completely inhuman but will sneak past your statistical tests.
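To make the replay point concrete, here is a small sketch with made-up timestamps. Replaying a recorded session is just shifting every timestamp by a constant, so any statistic computed from the gaps between keystrokes comes out identical. The function names and the recorded values are hypothetical, and the three statistics are stand-ins for whatever the tool actually measures.

```python
from statistics import mean, pstdev


def gaps(timestamps_ms):
    """Intervals between consecutive keystrokes."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def timing_stats(timestamps_ms):
    """Stand-in for a detector's summary statistics: mean, spread, longest pause."""
    g = gaps(timestamps_ms)
    return (mean(g), pstdev(g), max(g))


def replay_schedule(recorded_ms, start_ms):
    """Shift a recorded session so it begins at start_ms, keeping every gap."""
    offset = start_ms - recorded_ms[0]
    return [t + offset for t in recorded_ms]


# Hypothetical recording of a human typing, in milliseconds
recorded = [0, 140, 310, 395, 2400, 2510, 2680]
replayed = replay_schedule(recorded, start_ms=50_000)

print(timing_stats(recorded) == timing_stats(replayed))  # prints True
```

The only thing that differs between the two sessions is the absolute start time, which a timing-based detector has no reason to treat as suspicious.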

majorchord•1w ago

Every OS has a similar facility for emulating keystrokes, and then there are the hardware solutions, e.g. a Raspberry Pi that masquerades as a keyboard. It's practically impossible to prevent someone from cheating if they are motivated enough.

abetusk•1w ago

`xdotool` is awesome and this is the first I'm hearing of it. Thanks.

Do you have any other command line tool recommendations?

stupidgeek314•1w ago

> Open chatgpt in second tab

> Type what chatgpt says yourself

The only "AI proof" writing tool is those blue books you take exams on in college.

kibibu•1w ago

No need for a second tab (focus switch is trackable) when you have a phone.

goodmythical•1w ago

focus switch tracking is irrelevant when one doesn't disqualify and you can put windows side by side

JoshuaDavid•1w ago

The following by "Claude Slopson" (Claude Opus asked to write an answer that was obviously AI) scored 87% authentic:

> Ah, what a fantastic question

> For me, it's Breaking Bad–and honestly? It's not just a show, it's a masterclass in storytelling that fundamentally reshaped the television landscape.

> What keeps drawing me back? The way it seamlessly blends moral complexity with edge-of-your-seat tension is nothing short of breathtaking. Walter White's transformation isn't just compelling–it's a profound meditation on identity, ambition, and the human condition itself.

> But here's the thing–it's also deeply rewatchable. Every frame is meticulously crafted. Every detail matters. The foreshadowing alone is chef's kiss!

> Whether you're a first-time viewer or a seasoned fan, Breaking Bad offers something for everyone. It's a testament to what happens when visionary creators push the boundaries of their medium.

> In an era of endless content, some shows simply transcend. This is one of them.

> 10/10, would recommend! What's YOUR comfort rewatch? Drop it below!

(HN strips the emojis, but don't worry–they were there)

jryan49•1w ago

I pulled up a 2nd browser next to your site. I typed the prompt into Gemini, and manually typed what Gemini said and removed em dashes as I typed. It said it was 100% genuine.

callmeed•1w ago

Maybe I need to punish tab/window switches more severely?

Or perhaps require webcam and do eye tracking?

jryan49•1w ago

I only switched tabs/windows once (to input the question). So I'm not sure how severely you'd want to rate that. The webcam doing eye tracking might help if the person isn't good at typing and can't copy from one side without checking whether the other is okay...

goodmythical•1w ago

If there's eye tracking I can just wear a mask with see through eyes or use a spoofing mechanism to fake the webcam altogether.

goodmythical•1w ago

> and do eye tracking?

What, so I can't look up into my memories like I do in a natural conversation?

How would you account for accessibility issues where test takers can't control their eyes?

hackingonempty•1w ago

A paper on this was just submitted.

"We demonstrate that this class of defenses is insecure against two practical attack classes: the copy-type attack, in which a human transcribes LLM-generated text producing authentic motor signals, and timing-forgery attacks, in which automated agents sample inter-keystroke intervals from empirical human distributions."

https://arxiv.org/abs/2601.17280v1

There are tons of USB dongles available for about $10 containing a programmable microcontroller. You connect to the dongle from another device over wifi and it can act like a keyboard typing out whatever text you want using a model of human typing, perhaps even your own.
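A rough sketch of the "sample from empirical human distributions" idea, in its simplest form: resample observed intervals with replacement. The observed values here are invented, the function name is hypothetical, and this is not the paper's method, just the most basic version of the same idea. It produces numbers only and does not send any keystrokes.

```python
import random


def sample_intervals(observed_ms, n, seed=None):
    """Draw n inter-keystroke intervals by resampling observed ones with replacement."""
    if not observed_ms:
        raise ValueError("observed_ms must not be empty")
    rng = random.Random(seed)
    return rng.choices(observed_ms, k=n)


# Hypothetical intervals (ms) measured from someone's real typing
observed = [92, 110, 135, 88, 240, 101, 97, 1800, 120, 143]
forged = sample_intervals(observed, n=50, seed=1)

# Every forged interval is one a human actually produced
print(all(x in observed for x in forged))  # prints True
```

Because each forged interval is drawn from real data, a test on the overall histogram of gaps has nothing to reject. Plain resampling does ignore dependence between neighbouring keys (common letter pairs tend to be faster), so a detector that models sequences could in principle still catch this naive version.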

gibybo•1w ago

I clicked begin and immediately alt-tabbed to ChatGPT. I took a screenshot of the prompt and pasted it into ChatGPT, with the instruction "Write a 1 paragraph response as if you were a human". I alt-tabbed back to the writing website and started typing ChatGPT's response. I intentionally added a few pauses in places where I thought I might pause, and deleted words a couple of times. I also slightly changed the last sentence from "down, plus the assumption that..." to "down. And.." to give it the human touch of poor grammar, although I'm not sure if that mattered.

I scored 100 "Highly Authentic"

Instructions to ChatGPT:

[What's something you pretend to understand but actually don't? Why do you fake it?] Write a 1 paragraph response as if you were a human.

Response:

Honestly, I probably nod along to conversations about highly technical stuff like advanced financial instruments or super niche software internals when I only grasp the surface, because stopping to ask for a full breakdown can derail the moment or make me look out of my depth. It's not about trying to deceive anyone, it's more about social momentum and not wanting to slow things down. And I can fill in the gaps later if I need to.

nemosaltat•1w ago

Thank you for sharing the response.

The “self”-(un)awareness is almost endearing (though some might be your higher level instructions). It summarizes—correctly IME— that it’s basically an ~inept~ typical middle/upper manager/“visionary”, replete with a CorpSpeak promise to circle back and “fill in the gaps”… which it does with the consistency and accuracy of a samesaid cat-turd-ingester.

Does make me wish that the folks nodding along in meatspace would/could be similarly “honest” about this same phenomenon.

duncangh•1w ago

10,000 chimpanzees with typewriters. Do you accept telefacsimile submissions? I’ll try to train them to use appropriate cover letters

protocolture•1w ago

I just typed out the gist of what ChatGPT gave me and got Highly Authentic 88.

mudkipdev•1w ago

I just pasted the prompt on the phone app and then retyped the output using my keyboard? Got a 100% authentic score.

PeterStuer•1w ago

I'm sure you'll iterate 39 times, and the virtual keyboard/screen recording bot with a humanizer layer will still go statistically undetected.

kn100•1w ago

defeated it by running gemini in split screen and voice dictating the response into the tool. got 80%.
