
Ask HN: What in your opinion is the best model for vibecoding? My thoughts below

1•adinhitlore•4mo ago
So I've been vibecoding for years, but over the past 1-2 weeks it has become an obsession, to the point that my eyes are literally red and inflamed right now because I can't stop (slightly humorous... I was feeling worse yesterday; the redness is gone now).

Anyway, my takes:

1. The #1 spot is VERY debatable for me; it's a toss-up between GPT-5 high, "Claude thinking" (both Sonnet 4 and Opus 4.1), and, surprise surprise, Qwen 235B 'thinking' (the "hidden gem").

Their pros and cons:

GPT-5 high: Usually gives VERY long code, so it's generous; no compute is saved, and it's a bona fide model, but it sometimes seems too aligned for my taste. For example: whenever I ask it to design a novel text generation model, unless I am very specific in my requirements it tries to dumb it down into a pure n-gram model, which almost feels like an insult, basically saying "look, we at OpenAI are the best, here's a stupid Markov chain for you to play with, but leave the big game to us". If, however, you phrase the request in more detail, and even if you show some pessimism, it will not "echo back" the pessimism but rather try to convince you it can be done with some tweaks. The con: usually it's just... not smart. This is easily seen when you go through the code and find it has written code very specific to the example you gave, which is the number one symptom of bad programming; a variable/method should be as universal as possible. You don't need a helper that only uploads over FTP when the plan is to upload via HTTP and FTP, to give one example (see the sketches after this paragraph).
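For context on the "pure n-gram model" complaint, this is roughly all such a model amounts to: a minimal, hypothetical bigram Markov-chain text generator in Python (the corpus and names are made up for illustration). Getting this back when you asked for a novel architecture is why it feels like a brush-off.

    import random
    from collections import defaultdict

    def train_bigrams(text: str) -> dict[str, list[str]]:
        # Map each word to the list of words observed right after it.
        words = text.split()
        table = defaultdict(list)
        for prev, nxt in zip(words, words[1:]):
            table[prev].append(nxt)
        return table

    def generate(table: dict[str, list[str]], start: str, length: int = 20) -> str:
        # Walk the chain: at each step pick a random observed follower.
        out = [start]
        for _ in range(length - 1):
            followers = table.get(out[-1])
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

    corpus = "the cat sat on the mat and the dog sat on the rug"
    print(generate(train_bigrams(corpus), start="the"))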
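And to illustrate the over-specific pattern, here's another minimal, hypothetical sketch (function names and URLs are invented): the first helper is welded to FTP even though the stated requirement was "upload via HTTP and FTP", while the second dispatches on the URL scheme, so adding a transport doesn't touch the call sites.

    from io import BytesIO
    from ftplib import FTP
    import urllib.request

    # Over-specific: only handles the one case the example happened to mention.
    def upload_ftp_only(host: str, path: str, data: bytes) -> None:
        with FTP(host) as ftp:
            ftp.login()
            ftp.storbinary(f"STOR {path}", BytesIO(data))

    # More general: each transport is just a function keyed by URL scheme.
    def _send_ftp(url: str, data: bytes) -> None:
        host, _, path = url.removeprefix("ftp://").partition("/")
        with FTP(host) as ftp:
            ftp.login()
            ftp.storbinary(f"STOR {path}", BytesIO(data))

    def _send_http(url: str, data: bytes) -> None:
        req = urllib.request.Request(url, data=data, method="PUT")
        urllib.request.urlopen(req)

    TRANSPORTS = {"ftp": _send_ftp, "http": _send_http, "https": _send_http}

    def upload(url: str, data: bytes) -> None:
        # Usage: upload("ftp://example.com/f.txt", b"hi") or upload("https://example.com/f.txt", b"hi")
        scheme = url.split("://", 1)[0]
        send = TRANSPORTS.get(scheme)
        if send is None:
            raise ValueError(f"no transport registered for {scheme!r}")
        send(url, data)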

2. Claude: Initially I thought it was the best one, and for pure coding it's "getting there", but for designing algorithms, GPT-5 high and Qwen 'thinking' outperform it on ideas. I'd say Sonnet 4 32k is better for design and Opus for the actual coding, though it may perform differently depending on the task and programming language. The good news is the code usually compiles with very few warnings and almost never errors, so it knows what it's doing. Even GPT-5 high is worse here, and Qwen will sometimes, though rarely, give you bad code that produces an error, be it in Python 3 or C/gcc.

Since I covered the 'good', here are the bad and the ugly:

Gemini, Grok, Amazon Nova, whatever Microsoft has: don't, just don't. Their shortcomings are so obvious that I'm convinced all the people who hype them online are either Elon Musk (for Grok), Bill Gates (Phi-4 etc.), or Zuckerberg (Llama). Their code is very short, so it obviously won't cover the features requested; compilation feels like 'quantum mechanics', a 50/50 chance; the code is written in the worst way possible; and sometimes they misinterpret your goal entirely. You may have some luck debugging with Gemini 2.5 Pro if you're patient. Frankly, even the GPT-4 version on chatgpt.com (not the "arena"!) is bad at fixing errors, though OK with the basic ones.

Another hidden gem: https://console.upstage.ai/playground/chat I'm not "shilling" for it, hard to believe I know, but I don't ignore it because, as an indie model, I hope it's not too aligned, so it may actually give you code that Yudkowsky and Yampolskiy would consider an "immediate risk to humanity, civilization and the galaxy".

My experience is mostly with C (90% of it), with a lot of Python too and little to no C#, though back in the day vibecoding C# on GPT-4 sucked a lot.

My main issue right now is that while LLMs/transformers are great, they still lack innovation, the human thought power to come up with original ideas. They obviously code way faster than a human, and the code usually works with few warnings or errors, but I think the focus towards 2030 should be on innovative power and the design of complex algorithms. Altman dreaming about "discovering new physics" seems a bit ambitious given the current status quo. Again, they're great and they help me a lot; I'm looking forward to seeing their impact on society at a larger scale!

Comments

reify•4mo ago
The 1925 Ford Model T Touring Car is the best bet.

It has amazing brakes for a 1920s car.

The best thing, in my experience, is that it does not rely on fantasy AI to drive it. You can just turn the key and vrooom, away you go.

My local mechanic is particularly pleased with my purchase and recommendation.

He says he can repair my car without first having to repair the damage the AI mechanic did a few days earlier, which, in the long run, saves me an awful lot of money on car maintenance.

I don't have to pay two people to fix one job.

Isn't it amazing what humans can do?

incomingpain•4mo ago
https://aider.chat/docs/leaderboards/

I think this is my favourite benchmark, the one that fits me best.

Stuff I can run at home is a long way down the list, but that's fine with me.

RFCs vs. READMEs: The Evolution of Protocols

https://h3manth.com/scribe/rfcs-vs-readmes/
1•init0•5m ago•1 comments

Kanchipuram Saris and Thinking Machines

https://altermag.com/articles/kanchipuram-saris-and-thinking-machines
1•trojanalert•5m ago•0 comments

Chinese chemical supplier causes global baby formula recall

https://www.reuters.com/business/healthcare-pharmaceuticals/nestle-widens-french-infant-formula-r...
1•fkdk•8m ago•0 comments

I've used AI to write 100% of my code for a year as an engineer

https://old.reddit.com/r/ClaudeCode/comments/1qxvobt/ive_used_ai_to_write_100_of_my_code_for_1_ye...
1•ukuina•10m ago•1 comments

Looking for 4 Autistic Co-Founders for AI Startup (Equity-Based)

1•au-ai-aisl•20m ago•1 comments

AI-native capabilities, a new API Catalog, and updated plans and pricing

https://blog.postman.com/new-capabilities-march-2026/
1•thunderbong•21m ago•0 comments

What changed in tech from 2010 to 2020?

https://www.tedsanders.com/what-changed-in-tech-from-2010-to-2020/
2•endorphine•26m ago•0 comments

From Human Ergonomics to Agent Ergonomics

https://wesmckinney.com/blog/agent-ergonomics/
1•Anon84•30m ago•0 comments

Advanced Inertial Reference Sphere

https://en.wikipedia.org/wiki/Advanced_Inertial_Reference_Sphere
1•cyanf•31m ago•0 comments

Toyota Developing a Console-Grade, Open-Source Game Engine with Flutter and Dart

https://www.phoronix.com/news/Fluorite-Toyota-Game-Engine
1•computer23•33m ago•0 comments

Typing for Love or Money: The Hidden Labor Behind Modern Literary Masterpieces

https://publicdomainreview.org/essay/typing-for-love-or-money/
1•prismatic•34m ago•0 comments

Show HN: A longitudinal health record built from fragmented medical data

https://myaether.live
1•takmak007•37m ago•0 comments

CoreWeave's $30B Bet on GPU Market Infrastructure

https://davefriedman.substack.com/p/coreweaves-30-billion-bet-on-gpu
1•gmays•48m ago•0 comments

Creating and Hosting a Static Website on Cloudflare for Free

https://benjaminsmallwood.com/blog/creating-and-hosting-a-static-website-on-cloudflare-for-free/
1•bensmallwood•54m ago•1 comments

"The Stanford scam proves America is becoming a nation of grifters"

https://www.thetimes.com/us/news-today/article/students-stanford-grifters-ivy-league-w2g5z768z
2•cwwc•58m ago•0 comments

Elon Musk on Space GPUs, AI, Optimus, and His Manufacturing Method

https://cheekypint.substack.com/p/elon-musk-on-space-gpus-ai-optimus
2•simonebrunozzi•1h ago•0 comments

X (Twitter) is back with a new X API Pay-Per-Use model

https://developer.x.com/
3•eeko_systems•1h ago•0 comments

Zlob.h 100% POSIX and glibc compatible globbing lib that is faster and better

https://github.com/dmtrKovalenko/zlob
3•neogoose•1h ago•1 comments

Show HN: Deterministic signal triangulation using a fixed .72% variance constant

https://github.com/mabrucker85-prog/Project_Lance_Core
2•mav5431•1h ago•1 comments

Scientists Discover Levitating Time Crystals You Can Hold, Defy Newton’s 3rd Law

https://phys.org/news/2026-02-scientists-levitating-crystals.html
3•sizzle•1h ago•0 comments

When Michelangelo Met Titian

https://www.wsj.com/arts-culture/books/michelangelo-titian-review-the-renaissances-odd-couple-e34...
1•keiferski•1h ago•0 comments

Solving NYT Pips with DLX

https://github.com/DonoG/NYTPips4Processing
1•impossiblecode•1h ago•1 comments

Baldur's Gate to be turned into TV series – without the game's developers

https://www.bbc.com/news/articles/c24g457y534o
3•vunderba•1h ago•0 comments

Interview with 'Just use a VPS' bro (OpenClaw version) [video]

https://www.youtube.com/watch?v=40SnEd1RWUU
2•dangtony98•1h ago•0 comments

EchoJEPA: Latent Predictive Foundation Model for Echocardiography

https://github.com/bowang-lab/EchoJEPA
1•euvin•1h ago•0 comments

Disabling Go Telemetry

https://go.dev/doc/telemetry
1•1vuio0pswjnm7•1h ago•0 comments

Effective Nihilism

https://www.effectivenihilism.org/
1•abetusk•1h ago•1 comments

The UK government didn't want you to see this report on ecosystem collapse

https://www.theguardian.com/commentisfree/2026/jan/27/uk-government-report-ecosystem-collapse-foi...
5•pabs3•1h ago•0 comments

No 10 blocks report on impact of rainforest collapse on food prices

https://www.thetimes.com/uk/environment/article/no-10-blocks-report-on-impact-of-rainforest-colla...
3•pabs3•1h ago•0 comments

Seedance 2.0 Is Coming

https://seedance-2.app/
1•Jenny249•1h ago•0 comments