Ask HN: Is there a metric for AI code quality?

3•fractalf•8h ago
I've tried many different models and without doubt the code coming out of them differs a lot when it comes to "quality". Some of that is subjective for sure, but there are objective sides to "good" code.

I wish this were a metric in the AI benchmarks so I could choose a model based on it, because honestly it's one of the things I care most about.

Problem: How can you measure such things, and what are the metrics?

...maybe there just isn't a way to do it, since that metric isn't in the charts.

Comments

verdverm•6h ago
Why would a metric for code quality be different depending on how the code got into a file? In other words, if there were a good measure, would it not exist already for us? How do we measure the quality of our own code?
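
To make "how do we measure our own code" concrete, here is a minimal sketch of one conventional static measure, a rough cyclomatic-complexity count built on Python's standard `ast` module. The function name `approx_complexity` and the choice of which node types count as branches are assumptions for illustration, not something from this thread, and the score says nothing about whether the code is correct or readable.

```python
import ast

# Node types treated as decision points. This selection is an assumption
# of the sketch, not a standard definition.
_BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
    ast.comprehension,
)


def approx_complexity(source: str) -> int:
    """Return 1 plus the number of branch points found in the source."""
    tree = ast.parse(source)
    score = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands and adds two extra paths.
            score += len(node.values) - 1
    return score
```

The same function scores human-written and model-written code identically, which is the point: the measure does not care how the code got into the file.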
spgorbatiuk•2h ago
Not sure if I got the question right, but there are benchmarks like SWE pro and stuff. There's a whole other debate about whether you can trust them and whether the labs are training on those benchmarks, but that's one way to measure it.

Other than benchmarks, I'd say that's your own test suite.
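
As a rough illustration of the "own test suite" idea, the sketch below scores a candidate function by the fraction of input/expected-output cases it passes. The helper name `pass_rate` and the case format (a tuple of arguments paired with an expected value) are hypothetical, chosen only for this example. A real suite would more likely run through pytest or unittest.

```python
from typing import Any, Callable, Iterable, Tuple


def pass_rate(
    candidate: Callable[..., Any],
    cases: Iterable[Tuple[tuple, Any]],
) -> float:
    """Return the fraction of (args, expected) cases the candidate passes."""
    passed = 0
    total = 0
    for args, expected in cases:
        total += 1
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            # A crash counts as a failed case rather than aborting the run.
            pass
    return passed / total if total else 0.0
```

Running the same cases against the output of several models gives a number you can compare, but only for the behavior your tests actually cover.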
