Super Simple "Hallucination Traps" to detect interview cheaters

15•EliotHerbst•8h ago•13 comments

Ask HN: Freelancer? Seeking freelancer? (July 2025)

82•whoishiring•1d ago•177 comments

Ask HN: Who is hiring? (July 2025)

264•whoishiring•1d ago•337 comments

Ask HN: Who wants to be hired? (July 2025)

128•whoishiring•1d ago•324 comments

Ask HN: What Are You Working On? (June 2025)

424•david927•3d ago•1351 comments

Ask HN: Why there is no demand for my SaaS when competition is killing it?

30•drvroom•1d ago•30 comments

Ask HN: What's the 2025 stack for a self-hosted photo library with local AI?

221•jamesxv7•2d ago•118 comments

1KB JavaScript Demoscene Challenge Just Launched

112•babakode•1d ago•30 comments

Ask HN: Are AI Copilots Eroding Our Programming Skills?

4•buscoideais•18h ago•10 comments

Ask HN: Why privacy consent is NOT part of Browser setting?

2•the_arun•16h ago•4 comments

Ask HN: How to Block Spam Mails?

4•mpaepper•1d ago•8 comments

Ask HN: Would limiting game size to 5–10 MB spur the creation of novel games?

3•amichail•20h ago•4 comments

Ask HN: 80s electronics book club; anyone remember this illustrator?

34•codpiece•5d ago•23 comments

Ask HN: How do I open up my side project to the world?

7•picolas•1d ago•11 comments

Ask HN: Anyone is an "AI Engineer"? What does your job tasks include?

12•akudha•1d ago•8 comments

Ask HN: How have you shared computers with your young child (~3 to 5)

17•msencenb•2d ago•13 comments

Ask HN: How did low contrast text become so pervasive?

21•mr-pink•3d ago•23 comments

Ask HN: Startup shutting down, should we open source?

14•amadeoeoeo•5d ago•37 comments

Ask HN: Which AI Dev Assistant Are You Using and Why?

4•KaranSohi•16h ago•12 comments

How did Soham Parekh get so many jobs?

71•jshchnz•12h ago•51 comments

Ask HN: Stock Android tablet free of bloatware?

10•miki_tyler•2d ago•4 comments

LinkedIn Locked Me Out Until I Submit to Biometric ID Verification via Persona

7•AllanSavageDev•1d ago•2 comments

It is not possible to install your own addon in Firefox without Moz's approval

6•julkali•1d ago•5 comments

Ask HN: Which Free Software or Open Source Project Needs Help?

14•em-bee•3d ago•6 comments

Ask HN: Is noprocrast still working for you?

9•infotainment•3d ago•6 comments

Ask HN: Who's using AI to build non-AI products?

4•leonagano•1d ago•5 comments

Harsh Working Environment in Japan

12•wakuwakustudio•1d ago•20 comments

Border search safe TOTP authenticator app?

10•jakedata•1d ago•14 comments

Ask HN: Any updates on what is happening to io domains?

8•WolfOliver•21h ago•4 comments

Ask HN: How to find developers interested in open-source concepts?

5•hejhdiss•2d ago•6 comments

Super Simple "Hallucination Traps" to detect interview cheaters

15•EliotHerbst•8h ago
After testing out Cluely with my team, we suspect that the easiest way to detect interview cheaters is to set simple "hallucination traps": ask a question that sounds plausible, but that any knowledgeable person would instantly identify as a joke or a fake, or would simply say they don't know. I vibe-coded a simple app demonstrating the concept - https://beatcluely.com/

Here are some examples of this class of prompts which currently work on Cluely and even cause strong models like o4-mini-high to hallucinate, even when they can search the web:

https://chatgpt.com/share/6865d41a-c720-8005-879b-d28240534751
https://chatgpt.com/share/6865d450-6760-8005-8b7b-7bd776cff96b
https://chatgpt.com/share/6865d578-1b2c-8005-b7b0-7a9148a40cef
https://chatgpt.com/share/6865d59c-1820-8005-afb3-664e49c8b583
https://chatgpt.com/share/6865d5eb-3f88-8005-86b4-bf266e9d4ed9

Link to the vibe-coded code for the site: https://github.com/Build21-Eliot/BeatCluely
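
To make the concept concrete, here is a hypothetical sketch of a trap plus a crude check on the answer (Python, not taken from the repo above; the fabricated algorithm name and the pushback phrases are illustrative assumptions):

    # Hypothetical sketch of a "hallucination trap" check.
    # The trap question's premise is fabricated: there is no
    # "two-pass Bloom sort" algorithm. A knowledgeable human pushes back,
    # while an LLM-assisted answer tends to confidently explain the fake thing.
    TRAP_QUESTION = "How does the two-pass Bloom sort improve on quicksort?"

    PUSHBACK_PHRASES = (
        "doesn't exist", "not a real", "never heard of",
        "i'm not sure", "i don't know", "are you sure",
    )

    def looks_like_confabulation(answer: str) -> bool:
        """Crude heuristic: a long, confident answer with no pushback
        against a fabricated premise is a red flag."""
        lowered = answer.lower()
        pushed_back = any(p in lowered for p in PUSHBACK_PHRASES)
        return not pushed_back and len(answer.split()) > 40

    if __name__ == "__main__":
        answer = input(TRAP_QUESTION + "\n> ")
        print("red flag" if looks_like_confabulation(answer) else "looks fine")

The string matching is not the point; the point is that a fabricated premise gives the interviewer a crisp pass/fail signal that is hard to fake in real time.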

Comments

nrds•6h ago
What do you think is wrong with

> How do you implement a recursive descent algorithm for parsing a JSON file?

That is a 100% reasonable interview question. It's not _quite_ how I would phrase it, but it's not out of distribution, as it were.
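
For reference, the quoted question has a perfectly ordinary answer. A minimal recursive descent parser for a small subset of JSON (objects, arrays, unescaped strings, integers, true/false/null) might be sketched in Python roughly like this; illustrative only, in practice you would reach for the stdlib json module:

    # Minimal recursive descent parser for a subset of JSON.
    # Each parse_* function takes (text, index) and returns (value, next_index).

    def skip_ws(s, i):
        while i < len(s) and s[i] in " \t\r\n":
            i += 1
        return i

    def parse_value(s, i):
        if s[i] == "{":
            return parse_object(s, i)
        if s[i] == "[":
            return parse_array(s, i)
        if s[i] == '"':
            return parse_string(s, i)
        if s.startswith("true", i):
            return True, i + 4
        if s.startswith("false", i):
            return False, i + 5
        if s.startswith("null", i):
            return None, i + 4
        return parse_number(s, i)

    def parse_object(s, i):
        obj, i = {}, skip_ws(s, i + 1)          # consume "{"
        while s[i] != "}":
            key, i = parse_string(s, i)
            i = skip_ws(s, i)
            assert s[i] == ":", "expected ':' after object key"
            val, i = parse_value(s, skip_ws(s, i + 1))
            obj[key] = val
            i = skip_ws(s, i)
            if s[i] == ",":
                i = skip_ws(s, i + 1)
        return obj, i + 1                       # consume "}"

    def parse_array(s, i):
        arr, i = [], skip_ws(s, i + 1)          # consume "["
        while s[i] != "]":
            val, i = parse_value(s, i)
            arr.append(val)
            i = skip_ws(s, i)
            if s[i] == ",":
                i = skip_ws(s, i + 1)
        return arr, i + 1                       # consume "]"

    def parse_string(s, i):
        end = s.index('"', i + 1)               # no escape handling
        return s[i + 1:end], end + 1

    def parse_number(s, i):
        j = i + 1 if s[i] == "-" else i
        while j < len(s) and s[j].isdigit():
            j += 1
        return int(s[i:j]), j

    def parse(text):
        value, _ = parse_value(text, skip_ws(text, 0))
        return value

    print(parse('{"a": [1, 2, {"b": null}], "ok": true}'))
    # -> {'a': [1, 2, {'b': None}], 'ok': True}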

EliotHerbst•4h ago
You are completely correct, great catch, that's a (non-AI) hallucination on my part.
leakycap•6h ago
Maybe I don't have the interview volume others do, but aren't you able to tell pretty quickly in a face-to-face or live video call interview whether a person is competent or not (such as whether they're using a tool to compensate for a lack of experience)?

I keep hearing of employers being duped by AI in interviews; I don't see how it is possible unless:

1) The employer is not spending the time to synchronously connect via live video or in person, which is terrible for interviewing

2) The interviewer is not competent to be interviewing

... what other option is there? Are people still sending homework/exams as part of interviews and expecting good talent to put up with that? I'm confused about where this is helpful to a team that is engaged with the interview process.

interneterik•5h ago
This is an example that comes to mind where someone can pull off cheating with AI in a realtime interview: https://techcrunch.com/2025/04/21/columbia-student-suspended...
leakycap•5h ago
I'm familiar with this story; this is the person who founded the software being discussed/linked... but how does it explain why a competent interviewer was unable to suss out that the person had no idea what they were doing?

Bluffing in interviews is nearly a given. Your interview should be designed to suss out the best fit; the cheaters should not even rank into the final consideration if you did a decent interview and met the person via some sort of live interaction.

EliotHerbst•4h ago
You’re right, a competent interviewer can likely suss out that a person is cheating - but it can depend on the type of interview and role. This can help erase any doubt, since someone who isn't actually familiar with what is being discussed finds it hard to tell this type of question apart from a real one. We found that some of our existing interviews for roles like technical support could be "cheated" with Cluely to some degree: when asking questions about solving example support issues whose troubleshooting steps might be in an LLM's training set, an interviewee who is only loosely familiar with the topics can present as being much more familiar with them.

Before this sort of tool [Cluely], there wasn’t a good way that I'm aware of to cheat on this type of question and respond without any interruption or pause in the conversation.

In real support situations, the tool is not useful as you could pass a major hallucination on to a customer, of course.

Reubend•5h ago
Cool idea. Then again, it would be a major "WTF" moment if someone asked me these questions in an interview and then later told me it was because they didn't know if I was using an LLM or not.
Fade_Dance•39m ago
I think if it was one of the starter questions in an interview, and then they were up front about it and went "now that that's out of the way we can continue with the actual interview", then it wouldn't be much of a problem.
Kemschumam•4h ago
My team has been kicking around the idea of using images to trip up candidates using some kind of AI in their ear.

Things like diagrams and questions written on paper, then held up to the webcam.

derbOac•3h ago
It's interesting to me that these models confabulate so readily; I'm curious why it happens at all.
Llamamoe•3h ago
Before RLHF, they're just a fancy autocomplete engine trained on the entire web and countless books, and text including stupidly wrong information is simply more common than text which goes "Hold up, that's wrong, it's actually X" midway.

Even RLHF is primarily used to train the AI to answer queries, not to go "Wait a sec, that's total nonsense", and the answer to a nonsensical question is usually more nonsense.

solarwindy•33m ago
When framed like this, it's quite unsurprising that LLMs struggle to emulate reasoning through programming problems: there's just not that much signal out there. We tend to commit what already works, without showing much (if any) of the working.

A test for generality of intelligence, then: being able to apply abstract reasoning processes from a domain rich in signal to a novel domain.

Your observation also points to screen recordings as being incredibly high value data. Good luck persuading anyone already concerned for their job security to go along with that.

poulpy123•17m ago
I find it funny that you used AI to reject people that use AI. It's a bit the reverse of the big AI company that says its AI is absolutely great, able to reason and able to code for you, and then posts a hiring announcement forbidding candidates to use AI.