Ask HN: A proposal for interviewing "AI-Augmented" Engineers

3•vanbashan•16h ago

Hi HN,

I’m currently rethinking our hiring process. Like many of you, I feel that traditional algorithmic tests (LeetCode style) are becoming less relevant now that LLMs can solve them instantly. Furthermore, prohibiting AI during interviews feels counter-productive; I want to hire engineers who know how to use these tools effectively to multiply their output.

I am designing a new evaluation framework based on real-world open-source work, and I would love the community’s feedback on whether it sounds fair and effective, or whether I’m missing something critical.

The Core Philosophy: We shouldn't test if a candidate can write syntax better than an AI. We should test if they can guide, debug, and improve upon an AI's output to handle the "last mile" of complex engineering.

The Proposed Process:

1. Task Selection (Real-World Context): Instead of synthetic puzzles, we select open issues or discussions from public GitHub repositories that share a tech stack with our product.

    Scope: 2–4 hours.

    Types: Implementing a feature based on a discussion, fixing a bug, or reviewing a PR (specifically one that was eventually rejected, to test "taste").

    Ambiguity: Adjusted for seniority. Junior roles get clear specs; senior roles get vague problem statements requiring architectural decisions.

2. Establishing the "AI Baseline": Before giving the task to a candidate, we run it through current SOTA models with minimal human intervention.

    The Filter: If the AI solves it perfectly on the first try, we discard the task.

    The Sweet Spot: We are looking for tasks where the AI gets roughly 80% of the way there but fails on edge cases, context integration, or complex logic. In other words, the task should be neither trivial nor out of reach for the AI.
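
To make the filter concrete, here is a minimal sketch of how a baseline run could be triaged. It assumes each task comes with a set of pass/fail checks (tests, review criteria) that the unassisted AI attempt is scored against. The `BaselineRun` record, the `classify` helper, and the 0.6 floor are all hypothetical and only for illustration; the post itself only gives "80%" as a rough target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaselineRun:
    """Outcome of one unassisted AI attempt at a task (hypothetical record)."""
    task_id: str
    checks_passed: int
    checks_total: int

    @property
    def pass_rate(self) -> float:
        # Treat a task with no checks as unsolved rather than dividing by zero.
        return self.checks_passed / self.checks_total if self.checks_total else 0.0


def classify(run: BaselineRun, floor: float = 0.6) -> str:
    """Triage a task based on how the AI baseline did.

    `floor` is an assumed cut-off below which the task is considered too hard.
    """
    if run.checks_total > 0 and run.checks_passed == run.checks_total:
        return "discard: solved by AI"  # "The Filter": perfect on the first try
    if run.pass_rate < floor:
        return "discard: too hard"      # AI output is too far from a solution
    return "keep"                       # "The Sweet Spot": mostly right, fails on details


if __name__ == "__main__":
    runs = [
        BaselineRun("fix-cache-bug", 10, 10),
        BaselineRun("add-retry-flag", 8, 10),
        BaselineRun("rewrite-scheduler", 3, 10),
    ]
    for run in runs:
        print(run.task_id, "->", classify(run))
```

How "checks" are defined matters more than the thresholds: if they only cover the happy path, a task will look AI-solved even when the edge cases are exactly where it fails.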

3. The Candidate Test: Candidates are required to use their preferred AI coding tools. We ask them to submit not just the code, but also their chat/prompt history.

How We Evaluate (The "AI Delta"):

We aren't just looking at the final code. We analyze the "diff" between the Candidate’s process and our "AI Baseline":

    1. Exploration Strategy: How does the candidate "load context"? Do they blindly paste errors, or do they guide the AI to understand the repository structure first? We look for a clear understanding of the existing codebase.

    2. Engineering Rigor (TDD): Does the candidate push the AI to generate a test plan or reproduction script before generating the fix? We value candidates who treat the AI as a junior partner that needs verification.

    3. The "Last 10%" (Edge Cases): Since we picked tasks where AI fails slightly, we look at how the candidate handles those failure modes. Can they spot the boundary conditions and logic errors that the LLM glossed over?

    4. Documentation Hygiene: We specifically check whether the candidate instructs the AI to search existing documentation and, crucially, whether they prompt the AI to update the docs to reflect the new changes.

    5. Engineering Taste (The Rejected PR): For the code review task, we ask them to analyze a PR that was rejected in the real world, without telling them it was rejected. We want to see whether they would reject it too, and whether their reasoning aligns with our team's engineering culture (maintainability, complexity, clarity, etc.).

My Questions for HN:

    Is analyzing the "Chat History" too invasive, or is it the best way to see their thought process in 2026?

    For those of you hiring now, how do you distinguish between a "prompt kiddie" and a senior engineer who is just very good at prompting?

    Does the 2–4 hour time commitment feel reasonable for a "take-home" if the tooling makes the actual coding faster?

Thanks for your insights!

(Full disclosure: In the spirit of this topic, this post was composed by AI based on my draft notes.)

Comments

raw_anon_1111•4h ago

I interview like I always interview - behaviorally.

I filter for “smart and gets things done” (Joel Spolsky circa 2001).

“Tell me about the project that you are most proud of” and then we talk about the architecture, tradeoffs, technical and business complexities, etc.

“I see you’ve been working for $x years. I’m sure there is a project you look back on, knowing what you know now, and cringe. Tell me about the project and what you would do differently.”

There are a few other questions. But I am usually also trying to measure soft skills and what level of scope and ambiguity they are comfortable with. The last thing I’ve ever needed when I am looking to hire is another “ticket taker”.

Even before AI, why would you ever hire a junior dev? They are practically useless, they do negative work, and it is easy enough to poach someone with experience from another company for only slightly more money if you are paying standard enterprise dev wages.
