Ask HN: Is there a metric for AI code quality?

3•fractalf•8h ago

I've tried many different models and without doubt the code coming out of them differs a lot when it comes to "quality". Some of that is subjective for sure, but there are objective sides to "good" code.

I wish this was a metric for the AI benchmarks so I could choose a model based on this, because honestly it's one of the things I care most about.

Problem: How can you measure such things, and what are the metrics?

...maybe there just isn't a way to do it, since that metric isn't in the charts..

Comments

verdverm•6h ago

Why would a metric for code quality be different depending on how the code got into a file? In other words, if there were a good measure, would it not already exist for us? How do we measure the quality of our own code?

spgorbatiuk•2h ago

Not sure if I got the question right, but there are benchmarks like SWE-bench Pro and so on. There's a whole other debate about whether you can trust them, and whether the labs are training on those benchmarks, but that's one way to measure it.

Other than benchmarks, I'd say it's your own test suite.
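
To make the "own test suite" idea concrete, here is a minimal sketch of scoring code from different models against the same cases. Everything in it is an assumption for illustration: the `slugify_model_a` / `slugify_model_b` functions stand in for code produced by two hypothetical models, and `CASES` is a made-up test set. Note that it only measures correctness on your cases, not readability or maintainability.

```python
from typing import Callable, Dict, List, Tuple

# A test case is (positional arguments, expected result). Hypothetical format.
Case = Tuple[tuple, object]


def pass_rate(candidate: Callable, cases: List[Case]) -> float:
    """Fraction of cases the candidate gets right; exceptions count as failures."""
    if not cases:
        raise ValueError("need at least one test case")
    passed = 0
    for args, expected in cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash is simply a failed case
    return passed / len(cases)


def rank_models(candidates: Dict[str, Callable], cases: List[Case]) -> List[Tuple[str, float]]:
    """Score every candidate on the same cases, best first."""
    scores = [(name, pass_rate(fn, cases)) for name, fn in candidates.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Stand-ins for code generated by two hypothetical models.
def slugify_model_a(text: str) -> str:
    return "-".join(text.lower().split())


def slugify_model_b(text: str) -> str:
    return text.lower().replace(" ", "-")


# Made-up test suite for the task "turn a title into a slug".
CASES: List[Case] = [
    (("Hello World",), "hello-world"),
    (("  Ask  HN ",), "ask-hn"),
    (("x",), "x"),
]

if __name__ == "__main__":
    ranking = rank_models(
        {"model_a": slugify_model_a, "model_b": slugify_model_b}, CASES
    )
    for name, score in ranking:
        print(f"{name}: {score:.2f}")
```

The same cases are run against every candidate, so the scores are comparable across models even if public benchmarks are contaminated.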

