TLDR: I made a Claude Code plugin to measure coding productivity.
Using it, we measured a productivity increase of over 70% in iubenda's dev team.
Rationale and details follow.
For the past year I've been pretty obsessed with using AI to improve productivity, and I've been running initiatives to increase AI adoption within iubenda and team.blue, particularly amongst developers.
The challenge was to measure the results, but I saw problems with the most commonly used methods:
- % of developers using Claude Code: pretty moot; it just tells you who is using the tool. Fine for an initial rollout, but it doesn't really give you a sense of what the productivity gain is. It's the kind of "tick the box" approach that leaves many companies with very superficial AI adoption
- Number of MRs / PRs: not the worst metric, but very unreliable, since teams and developers differ in contribution style (few large changes vs. many small ones). More or fewer MRs / PRs doesn't necessarily mean a more or less productive team
- Story points: not all teams use them, and story-point scoring is a qualitative, subjective process. It also requires tracking story points across MRs / PRs / commits, which is complex: very few teams have a truly deterministic link between their git repo and their task management tool, so gaps in data coverage make this method unreliable even for teams that do use story points
- Lines of code changed: I really like the objectivity of this metric. If a team's code verbosity and its mix of change types (tests, translations, dependency updates, comments, refactors, genuinely new code) stay constant, it's not bad at all. In our tests, though, it still showed huge variability: large refactors or wide but low-value changes skewed the numbers completely
Several weeks into the rabbit hole, I landed on using lines of code changed, BUT scoring them with Haiku. In essence, the plugin will:
- Download all diffs from all repos you select, across all branches, and deduplicate them to avoid double-counting merge commits
- Score each file diff with Haiku, assigning it a weight: e.g. zero for some file changes, low for a translation change, low or zero for a library update, high for a genuine code change or refactor, etc. (this weight can also act as a code verbosity index)
- Calculate a sort of "weighted lines of code" metric that you can plot over time to measure productivity improvements
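The three steps above can be sketched roughly like this. This is illustrative Python, not the plugin's actual code: the commit structure and the weight values are assumptions, and in the real pipeline the per-file weights come from Haiku rather than being hard-coded. Deduplication fingerprints the diff body while ignoring hunk offsets, roughly what `git patch-id` does.

```python
# Illustrative sketch of the pipeline: dedupe diffs, then sum weighted LoC.
# Assumption: each commit is a dict with a raw `diff` string, a week label,
# and per-file `lines_changed` plus a model-assigned `weight` in [0, 1].
import hashlib
from collections import defaultdict

def dedupe_diffs(commits):
    """Drop commits whose diff content was already seen (e.g. via a merge).
    Fingerprints the diff body ignoring index/hunk-offset lines, similar
    in spirit to `git patch-id --stable`."""
    seen, unique = set(), []
    for c in commits:
        body = "\n".join(
            line for line in c["diff"].splitlines()
            if not line.startswith(("index ", "@@"))
        )
        fp = hashlib.sha256(body.encode()).hexdigest()
        if fp not in seen:
            seen.add(fp)
            unique.append(c)
    return unique

def weighted_loc(commits):
    """Sum lines_changed * weight per week: the plottable metric."""
    totals = defaultdict(float)
    for c in dedupe_diffs(commits):
        for f in c["files"]:
            totals[c["week"]] += f["lines_changed"] * f["weight"]
    return dict(totals)

commits = [
    {"week": "2025-W01", "diff": "@@ -1 +1 @@\n-a\n+b",
     "files": [{"lines_changed": 120, "weight": 0.9},    # genuine code change
               {"lines_changed": 400, "weight": 0.05}]}, # translation file
    {"week": "2025-W01", "diff": "@@ -9 +9 @@\n-a\n+b",  # same change via a merge
     "files": [{"lines_changed": 120, "weight": 0.9},
               {"lines_changed": 400, "weight": 0.05}]},
]
print(weighted_loc(commits))  # the duplicate diff is counted only once
```

Note how the second commit, identical except for hunk offsets, is dropped, so a change that reaches the default branch through several merges contributes once.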
Scoring is very cheap, at around $7 per thousand commits.
The plugin also has a number of other features, such as report generation, developer anonymization via local hashing, and the option to use BigQuery to share the database across a team.
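The anonymization step could look something like the following. This is a minimal sketch under the assumption that author emails are the identity being hashed; the salt value and the 12-character truncation are illustrative choices, not the plugin's actual scheme. Only someone holding the local salt can re-link a hash to an email by recomputing it, which is what keeps the shared BigQuery data pseudonymous.

```python
# Sketch of local developer anonymization via salted hashing (assumed scheme).
import hashlib

SALT = b"keep-this-local-and-secret"  # hypothetical value; stays on the local machine

def anonymize(author_email: str) -> str:
    """Stable pseudonymous ID: the same email always maps to the same hash,
    but the mapping can't be reversed without the salt."""
    digest = hashlib.sha256(SALT + author_email.lower().encode()).hexdigest()
    return digest[:12]  # short, stable pseudonym for reports and dashboards
```

Lower-casing before hashing makes `Dev@example.com` and `dev@example.com` collapse to one developer, which matters because git author strings are often inconsistent across machines.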
I'm publishing it so you can grill me on the methodology, cross-check it, find bugs, you name it. All contributions are welcome.
apothegm•22m ago
Really? We’re back to using LoC as a metric? Have we learned absolutely nothing in the past 50 years?
Oh, never mind, we already know the answer to that…
Facens•1h ago