Do you prefer GPT or Google Gemini?

2•AaronSwift1•5mo ago

and why

Comments

fbhabbed•5mo ago

GPT5-Thinking if I need a precise answer with the fewest possible mistakes.

GPT5-Pro is the real deal.

Gemini if I need creative insights and a pleasant talk, but this comes at the cost of more mistakes (and it's hella stubborn).

Hopefully Gemini 3.0 will fix this.

Topfi•5mo ago

I currently rely on GPT-5 Thinking (Medium Reasoning) for most tasks because, of all the models currently available, the GPT-5 series (and GPT-4.1 before that) has been the most reliable at following instructions to the letter, doing no less, but more importantly, no more.

Both the Claude models (from 3.5 Sonnet to 4.1 Opus) and Gemini 2.5 Pro have historically taken a lot more liberties, which some users find appealing, but which I have come to not want when relying on a model for consistent output. I can see why some find great value in a model already implementing e.g. an auth provider when asked for the frontend of a login page, but for guiding a model, I personally prefer that something not happen if I didn't explicitly request it. Claude especially, as part of agentic coding workflows, has a tendency to simply try e.g. a different package than what was requested, which some users may not notice. I found it very funny when Claude 4 Sonnet once fully reimplemented an infinite canvas because @xyflow failed to install properly. I'd rather a model error out there or ask the user in the loop to confirm.

In regard to instruction following, while all three frontier providers do well with their context windows, the GPT-5 models are still a bit more preferable for me, despite having only 400k vs 1 million tokens, simply because what is in context can not only be recalled, but will be adhered to reliably as well.

GPT-5 also seems a bit better regarding CSS, though I have far too limited UI taste to actually make a solid judgement on that front, and styling is of course subjective.

Additionally, when benchmarking all three frontier models side by side, I have yet to find a coding task that GPT-5 cannot solve but the others can. I did however find certain cases where my initial instructions were lacking or not comprehensive enough, leading to all three having issues completing a task. In those cases, I found that Gemini 2.5 Pro, when provided with the code base, does best at rewriting an existing prompt. The rewritten prompts then usually have far higher success rates when provided to GPT-5 Thinking (Medium Reasoning) or, to a lesser extent, to one of the Claude 4 models. However, these Gemini-provided prompts also occasionally contain inventions/hallucinations, so I must always triple check prompts when doing this.

For context, the main coding problem I am using model assistance for at the moment is some poorly designed, Figma-inspired real-time syncing code with some overly odd edge cases, courtesy of my limited skillset.

For non-coding stuff, I have in the last semester mainly relied on Gemini 2.5 (sometimes Pro, often Flash) for creating nice summaries of lectures. I found any other models (doesn't matter whether OpenAI's previous models or anything from Anthropic, Mistral, Deepseek, Qwen, etc.) less suitable, mainly because these tended to output far too condensed summaries, often truncating what is absolutely vital information. Gemini models are far more willing to actually output an extensive, maybe a bit too verbose summary, but I'd rather remove a few lines. I haven't yet gotten enough experience with GPT-5 as a summarization tool as the semester is only just starting, so I cannot say how well OpenAI's newest series does there, but from very limited experience, GPT-5-mini has potential here to replace Gemini 2.5 Flash as my go-to.
