Recently I’ve interviewed for a handful of “AI Engineer” positions at several startups, and I noticed a shift in the format of technical assessments: timed online assessments and live LeetCode-style rounds have been replaced by a “case study” format where AI use is encouraged. These were the two main patterns I saw:
1. Take-home: the candidate clones a GitHub repo or receives a zip file with starter code and a README. They complete the assignment per the instructions using any tools or resources they like, push the final code to a GitHub repo, and submit a link to it. The hiring team then evaluates the submission.
2. Live assessment: the candidate is on a screensharing call with an interviewer. They clone a GitHub repo or receive a zip file with starter code and README instructions, and the interviewer observes them thinking out loud to assess how they solve the problem with AI.
Both of these formats still seem sub-optimal. Reviewing a submitted take-home means the hiring manager sifts through a codebase that is entirely AI-generated, which reveals little about the candidate’s thought process or problem-solving ability. Live “vibe” assessment costs the interviewer (often the CTO) a full hour per candidate.
Moreover, both throw away the most valuable piece of information: the Claude Code session log.
I built Gonfire: a proxy that records and analyzes a candidate’s Claude Code interactions while they solve the assessment, plus a digestible report for the hiring manager. (I’ve refrained from deriving any quantitative performance metrics until I’m confident there’s a solid basis for them, so the analysis is primarily qualitative for now.)
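For the curious, the general shape of a recording proxy like this is simple, since Claude Code can be pointed at an alternate endpoint via the ANTHROPIC_BASE_URL environment variable and the Messages API is stateless (each turn resends the full conversation, so logging the exchanges captures the whole session). Here’s a minimal sketch in Python with Flask and requests; this is not Gonfire’s actual implementation, the file and log names are hypothetical, and a real version would need to handle auth, retries, and streaming edge cases more carefully:

    # logging_proxy.py -- illustrative sketch only, not Gonfire's code.
    # Assumes Claude Code is started with ANTHROPIC_BASE_URL pointing here.
    import json
    import time

    import requests
    from flask import Flask, Response, request

    UPSTREAM = "https://api.anthropic.com"
    LOG_PATH = "session_log.jsonl"  # hypothetical log location

    app = Flask(__name__)

    @app.route("/<path:path>", methods=["GET", "POST"])
    def proxy(path):
        # Forward the request verbatim (auth headers pass through),
        # dropping headers that requests will recompute or that would
        # give us compressed bodies to log.
        skip = {"host", "content-length", "accept-encoding"}
        headers = {k: v for k, v in request.headers if k.lower() not in skip}
        body = request.get_data()
        upstream = requests.request(
            request.method,
            f"{UPSTREAM}/{path}",
            headers=headers,
            data=body,
            stream=True,
        )

        chunks = []

        def relay():
            # Stream the (possibly SSE) response back to Claude Code
            # while keeping a copy; once the stream ends, persist the
            # full exchange as one JSONL record.
            for chunk in upstream.iter_content(chunk_size=None):
                chunks.append(chunk)
                yield chunk
            with open(LOG_PATH, "a") as f:
                f.write(json.dumps({
                    "ts": time.time(),
                    "path": path,
                    "request": body.decode("utf-8", "replace"),
                    "response": b"".join(chunks).decode("utf-8", "replace"),
                }) + "\n")

        return Response(
            relay(),
            status=upstream.status_code,
            content_type=upstream.headers.get("Content-Type"),
        )

    if __name__ == "__main__":
        app.run(port=8082)

Run it, start a session with ANTHROPIC_BASE_URL=http://localhost:8082 claude, and every API exchange lands in session_log.jsonl for later analysis.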
I took an assessment myself; you can view my results in the demo.
Live demo: https://app.gonfire.io (showhn@gonfire.io / Aa123123123123)
Relevant post from Anthropic: https://www.anthropic.com/engineering/AI-resistant-technical...
This could open up some interesting directions in the future:
- “Anti-Spoiler”: preventing LLMs from spoiling the problem’s key insights/ideation
- Clustering candidates based on distinguishing features of their thinking process