
Ask HN: How are you doing technical interviews in the age of Claude/ChatGPT?

6•jonjou•20h ago
I’m a founder/dev trying to figure out a better way to do technical interviews, because the current state is a nightmare.

Right now, every standard take-home or HackerRank/LeetCode test is easily solved by LLMs. As a result, companies are accidentally hiring what we call "vibe coders": candidates who are phenomenal at prompting AI to generate boilerplate, but who completely freeze when the architecture gets complex, when things break, or when the AI subtly hallucinates.

We are working on a new approach and I want to validate the engineering logic with the people who actually conduct these interviews.

Instead of trying to ban AI (which is a losing battle), we want to test for "AI Steering".

The idea:

1. Drop the candidate into a real, somewhat messy sandbox codebase.

2. Let them use whatever AI they want.

3. Inject a subtle architectural shift, a breaking dependency, or an AI hallucination.

4. Measure purely through telemetry (Git diffs, CI/CD runs, debugging paths) how they recover and fix the chaos.

Basically: Stop testing syntax, start testing architecture and debugging skills in the age of AI.
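To make step 4 concrete, one hypothetical telemetry metric is time-to-stable-recovery from the injected fault, computed from CI runs. A minimal sketch (the `CiRun` shape and the scoring rule are my assumptions, not anything the post specifies):

```python
from dataclasses import dataclass

@dataclass
class CiRun:
    minute: int   # minutes since the fault was injected
    passed: bool  # did this CI run go green?

def time_to_recovery(runs):
    """Minutes from fault injection to the first passing CI run
    that is never followed by another failure (stable recovery).
    Returns None if the candidate never stably recovered."""
    recovery = None
    for run in sorted(runs, key=lambda r: r.minute):
        if run.passed:
            if recovery is None:
                recovery = run.minute
        else:
            recovery = None  # regression: reset the clock
    return recovery

runs = [CiRun(5, False), CiRun(12, True),   # early fix...
        CiRun(20, False),                   # ...that regressed
        CiRun(31, True), CiRun(40, True)]   # stable from minute 31
print(time_to_recovery(runs))  # 31
```

A metric like this rewards candidates who converge on a durable fix rather than ones who briefly get CI green by papering over the fault.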

Before we spend months building out the backend for this simulation, I need a reality check from experienced leads:

1. Does testing a candidate's ability to "steer" and debug AI-generated code make more sense to you than traditional algorithms?

2. How are you currently preventing these "prompt-only" developers from slipping through your own interview loops?

(Not linking anything here because there's nothing to sell yet, just looking for brutal feedback on the methodology.)

Comments

dakiol•19h ago
> 1. Does testing a candidate's ability to "steer" and debug AI-generated code make more sense to you than traditional algorithms?

Testing the candidate's ability to "steer" agents seems like testing their ability to know the Java API or to recite SOLID by heart.

> 2. How are you currently preventing these "prompt-only" developers from slipping through your own interview loops?

We don't ask LeetCode anymore. We keep the usual systems design interview, in which AI isn't needed (or at least we don't allow it, because in this kind of interview we are more interested in seeing how the candidate thinks and so on).

We have a new stage in our job interview, though: generic Q&A about the fundamentals of software engineering/computer science. Again, we don't care anymore how candidates produce code. We care about what they know and what they don't know: what the scope of their knowledge is, and when they need to rely on AI to come up with an answer. Silly (non-real) example: "Can you write a program that detects if another program halts?". The people we want are the ones who would say something about the Halting Problem, but would also perhaps be practical and ask more questions about the program's requirements.
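The practical side of that question is usually a timeout-based approximation, since a perfect halts-detector is impossible. A minimal sketch of that pragmatic answer (the `halts_within` helper and its names are illustrative, not from the thread):

```python
import threading

def halts_within(func, timeout):
    """Pragmatic stand-in for an (impossible) perfect halts-detector:
    run `func` on a daemon thread and report whether it finished
    within `timeout` seconds. A timeout is evidence of non-termination,
    not proof -- which is exactly the Halting Problem's point."""
    t = threading.Thread(target=func, daemon=True)
    t.start()
    t.join(timeout)
    return not t.is_alive()

def finishes():
    sum(range(1000))

def loops_forever():
    while True:
        pass

print(halts_within(finishes, 1.0))       # True
print(halts_within(loops_forever, 0.2))  # False: gave up waiting
```

A candidate who both names the theoretical limit and proposes something like this bounded check is showing exactly the breadth-plus-pragmatism mix described above.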

You get the point: we look for people with a good breadth of knowledge, who can communicate well and know their shit. Whether they can use tool x or y (including LLMs) can be taken for granted with such people.

jonjou•19h ago
This is a fantastic perspective, thank you. You hit the nail on the head: the ultimate goal is testing fundamental engineering breadth and systems thinking, not tool usage.

I should definitely clarify my use of the word steering — I completely agree that testing prompt engineering is just the new API memorization, which is useless.

By steering, I mean putting them in a situation where the AI generates a plausible but architecturally flawed solution, and seeing if they have the fundamental knowledge to spot the BS, understand the scope of the problem, and fix it.

Basically, an automated way to test the exact critical thinking you mentioned.

I love your approach of dropping LeetCode for fundamentals Q/A and Systems Design. But out of curiosity, how do you scale that at the top of the funnel? Doing deep, manual 1-on-1 assessments gives the best signal by far, but doesn't that burn a massive amount of your senior engineers' time?

jarl-ragnar•17h ago
We binned any form of coding questions or take-home tasks. Our preference now is to focus on architectural concepts and the ability to apply them to a problem.

So, as happened last week, if I'm interviewing an Elixir dev I'm going to be interested in your knowledge of the BEAM and how its features can be used to solve common architectural problems.

raw_anon_1111•16h ago
Absolutely nothing changes about how I interview. I care whether you are “smart and gets things done”.

“Tell me about the project that you are most proud of.” And then dig in, asking them about their challenges and decision-making processes, and gauge the level of scope, impact, and ambiguity they know how to work at.

“I see you have been working for $x years. Knowing what you know now, what would you do differently?”

“Say you are in a meeting with me, the CEO, and other senior developers who have been at the company for a while, and we all agree on an idea that you know from experience is a bad one. What would you do?”

Follow up question: “What would you do if after we listened to you, we decided to go in another direction?”

“Tell me about a time when you had unclear requirements. How did you handle it?” This gets back to ambiguity; there is a lot of that at startups.

austin-cheney•15h ago
As a hiring manager, here is what I do in interviews:

1. I don't do anything with code. No whiteboard. I usually don't even ask code related questions. Code literacy is a silent prerequisite. You wouldn't waste a lawyer's time asking if they could read books.

2. I talk to people. Soft skills are more important than tech skills. I want to see how they converse, their confidence, and their level of comfort.

3. I do ask technical questions, and I ask them fast. It takes AI time to generate an answer, so I am looking at speed of response. If the candidate needs to pause to think about a creative answer, that's fine, because I am watching for that too. When they pause, are they reading something from a screen, or does their body language convey that they are using their imagination? I also pivot on the fly to other subjects between questions.

4. When I ask technical questions I try to make them as open ended as possible so that there is no single right answer. I want to know what the candidate would do in a given scenario. If I circle back on the same question 20 minutes later will I get the same answer?

5. I can usually make a determination about a candidate within 7-9 minutes of talking with them, but I schedule an hour to really make it more of a conversation and dive a bit deeper.

6. I don't really care about ASD in my current line of work, because in enterprise API management you will spend the majority of your time performing requirements discovery/analysis as opposed to writing code. So, it really is about the soft skills. If I were to go back to JavaScript I would absolutely look for ASD in the interview. Here is what I would look for:

6a. Frequency of first person pronouns. Does the candidate like to talk about themselves or is it more about a product or skill?

6b. Can the candidate measure things? Ask them to demonstrate some manner of technical performance analysis with numbers.

6c. Ask questions that reveal a bias. Does the candidate prefer objectivity when building something from scratch, or must everything, even the most trivial of things, be familiar and comforting?

6d. On a scale between reckless and safe, where does the candidate land, and how do they control for it?

As the candidate answers my questions, I am really only listening just enough to know where to pivot for the next question in rudely short time. Most of what I am paying attention to is where the candidate's eyes go between talking to me and answering a challenging question. I am also watching for the level of relaxation in the candidate's body and the stress in their voice.

prateeksi•1h ago
We're building a CLOB-based DEX, and this problem hits close to home: we've interviewed a lot of smart contract developers, and AI makes it nearly impossible to assess real understanding.

What we've shifted to: giving candidates a live matching engine with a subtle bug in the order execution logic and asking them to find and fix it. Prompting Claude gives you something that looks right but breaks under edge cases: partial fills, price-time priority violations, self-trade prevention. The candidates who actually understand the mechanics catch it. The ones steering AI without real knowledge submit a fix that breaks three other things.

Your "AI steering" framing is exactly right. The real skill now is knowing when the output is plausible but wrong, and that only comes from genuine domain knowledge.
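For readers unfamiliar with the bug class described here: price-time priority means an incoming order matches the best-priced resting orders first, FIFO within a price level, with partial fills along the way. A minimal illustrative sketch of the correct behavior (not their actual engine; all names are mine):

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Order:
    id: str
    price: float
    qty: int

def match_buy(order, asks):
    """Match an incoming buy against resting asks with price-time
    priority: best (lowest) price first, FIFO within a price level.
    `asks` maps price -> deque of resting Orders in arrival order.
    Returns a list of (resting_id, price, qty) fills."""
    fills = []
    for price in sorted(asks):  # best ask price first
        if price > order.price or order.qty == 0:
            break  # no more crossing prices, or fully filled
        level = asks[price]
        while level and order.qty > 0:
            resting = level[0]  # oldest order at this level (time priority)
            traded = min(order.qty, resting.qty)  # partial fill allowed
            fills.append((resting.id, price, traded))
            order.qty -= traded
            resting.qty -= traded
            if resting.qty == 0:
                level.popleft()
    return fills

asks = {101.0: deque([Order("a", 101.0, 5), Order("b", 101.0, 5)]),
        102.0: deque([Order("c", 102.0, 10)])}
print(match_buy(Order("x", 101.5, 8), asks))
# "a" fills fully, "b" fills partially, "c" never trades (price too high)
```

A subtle interview bug in this logic might be matching the newest order at a level first, or skipping the partial-fill decrement, which is exactly what AI-generated fixes tend to get plausibly wrong.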
