Right now, every standard take-home or HackerRank/LeetCode test is easily solved by LLMs. As a result, companies are accidentally hiring what we call "vibe coders": candidates who are phenomenal at prompting AI to generate boilerplate, but who completely freeze when the architecture gets complex, when things break, or when the AI subtly hallucinates.
We are working on a new approach and I want to validate the engineering logic with the people who actually conduct these interviews.
Instead of trying to ban AI (which is a losing battle), we want to test for "AI Steering".
The idea:
1. Drop the candidate into a real, somewhat messy sandbox codebase.
2. Let them use whatever AI they want.
3. Inject a subtle architectural shift, a breaking dependency, or an AI hallucination.
4. Measure, purely through telemetry (Git diffs, CI/CD runs, debugging paths), how they recover and fix the chaos. There's a rough sketch of the scoring below.
Basically: Stop testing syntax, start testing architecture and debugging skills in the age of AI.
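To give a feel for step 4, here is a rough sketch of the kind of scoring we're imagining. Everything in it is hypothetical (the function names, the two metrics, the repo layout); it just mines the sandbox repo's git history between the injected break and the first green build:

    # Hypothetical scoring sketch: mine the sandbox repo's git history to see
    # how a candidate recovered from an injected break. Names and metrics are
    # illustrative only, not a real product API.
    import re
    import subprocess
    from dataclasses import dataclass

    @dataclass
    class RecoveryMetrics:
        commits_to_fix: int   # attempts between the injected break and green CI
        lines_touched: int    # size of the eventual fix (insertions + deletions)

    def run_git(repo: str, *args: str) -> str:
        """Run a git command inside the sandbox repo and return its stdout."""
        return subprocess.run(
            ["git", "-C", repo, *args],
            capture_output=True, text=True, check=True,
        ).stdout

    def recovery_metrics(repo: str, break_commit: str, fix_commit: str) -> RecoveryMetrics:
        # Count the candidate's commits between the break and the green build.
        commits = int(run_git(repo, "rev-list", "--count",
                              f"{break_commit}..{fix_commit}"))
        # Total insertions + deletions across the whole recovery window.
        stat = run_git(repo, "diff", "--shortstat", break_commit, fix_commit)
        ins = re.search(r"(\d+) insertion", stat)
        dels = re.search(r"(\d+) deletion", stat)
        lines = (int(ins.group(1)) if ins else 0) + (int(dels.group(1)) if dels else 0)
        return RecoveryMetrics(commits_to_fix=commits, lines_touched=lines)

The totals are deliberately crude; we suspect the real signal is in the shape of the path (targeted bisection vs. flailing reverts), which the same history can expose.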
Before we spend months building out the backend for this simulation, I need a reality check from experienced leads:
1. Does testing a candidate's ability to "steer" and debug AI-generated code make more sense to you than traditional algorithm questions?
2. How are you currently preventing these "prompt-only" developers from slipping through your own interview loops?
(Not linking anything here because there's nothing to sell yet, just looking for brutal feedback on the methodology.)
dakiol•1h ago
Testing the candidate's ability to "steer" agents seems to me like testing their ability to know the Java API or to recite SOLID by heart.
> 2. How are you currently preventing these "prompt-only" developers from slipping through your own interview loops?
We don't ask LeetCode questions anymore. We keep the usual systems design interview, in which AI isn't needed (or at least we don't allow it, because in this kind of interview we are more interested in seeing how the candidate thinks and so on).
We have a new stage in our job interview, though: generic Q/A about the fundamentals of software engineering/computer science. Again, we don't care anymore how candidates produce code. We care about what they know, and what they don't know: what the scope of their knowledge is, and when they need to rely on AI to come up with an answer. Silly (non-real) example: "Can you write a program that detects if another program halts?". The people we want are the ones who would say something about the Halting Problem, but also perhaps be practical and ask more questions about the requirements for such a program.
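(For anyone who wants the textbook version of that answer, here is the standard diagonalization argument as a Python sketch. The halts oracle is of course purely hypothetical, since the whole point is that it can't exist.)

    # Sketch of why no general halts() can exist. The oracle below is
    # purely hypothetical; the point is the contradiction it produces.
    def halts(program, inp) -> bool:
        """Pretend oracle: True iff program(inp) eventually halts."""
        raise NotImplementedError  # no such implementation can exist

    def paradox(program):
        # Do the opposite of whatever the oracle predicts for a program
        # that is fed its own source.
        if halts(program, program):
            while True:   # oracle says "halts", so loop forever
                pass
        return            # oracle says "loops", so halt immediately

    # paradox(paradox) halts if and only if it doesn't halt: contradiction,
    # so halts() cannot be implemented for arbitrary programs.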
You get the point: we look for people with a good breadth of knowledge, who can communicate well and know their shit. Whether they can use tool x or y (including LLMs) is a given for such people.
jonjou•46m ago
I should definitely clarify my use of the word "steering": I completely agree that testing prompt engineering is just the new API memorization, which is useless.
By "steering", I mean putting them in a situation where the AI generates a plausible but architecturally flawed solution, and seeing whether they have the fundamental knowledge to spot the BS, understand the scope of the problem, and fix it.
Basically, an automated way to test the exact critical thinking you mentioned.
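To make that concrete, here's a toy version of the kind of trap we'd inject. It's hypothetical and deliberately simplified (fetch_permissions_from_db is just a stand-in), but it's the flavor: code that reads fine in a quick review yet is wrong at the architecture level.

    # Plausible-looking "AI output" with a subtle flaw: the mutable default
    # argument makes `cache` a single dict shared for the lifetime of the
    # process, so revoked permissions keep working and memory grows unbounded.
    def fetch_permissions_from_db(user_id):
        # Stand-in for a real query; hypothetical helper.
        return {"user": user_id, "role": "viewer"}

    def get_user_permissions(user_id, cache={}):
        if user_id not in cache:
            cache[user_id] = fetch_permissions_from_db(user_id)
        return cache[user_id]

Whether the candidate spots why this passes a quick demo but misbehaves in a long-lived service is exactly the signal we want the telemetry to capture.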
I love your approach of dropping LeetCode for fundamentals Q/A and Systems Design. But out of curiosity, how do you scale that at the top of the funnel? Doing deep, manual 1-on-1 assessments gives the best signal by far, but doesn't that burn a massive amount of your senior engineers' time?