I tested:
- Gemini Pro 3
- Opus 4.6
- GLM-5
- Kimi 2.5
My rough criteria:
- Code correctness (first-pass compile success)
- Quality of architectural suggestions
- Refactor clarity
- Handling of existing code context
- Cost per useful output
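For the compile-success criterion, a cheap automated stand-in is just checking whether the generated Go even parses (a sketch, not my actual harness; the helper name is made up):

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// firstPassOK reports whether a model's generated Go source at least
// parses. A full `go build` is the real test, but a parse check is a
// fast first filter that needs no module setup.
func firstPassOK(src string) bool {
	fset := token.NewFileSet()
	_, err := parser.ParseFile(fset, "gen.go", src, 0)
	return err == nil
}

func main() {
	good := "package main\nfunc main() {}\n"
	bad := "package main\nfunc main( {}\n" // unbalanced paren
	fmt.Println(firstPassOK(good), firstPassOK(bad))
}
```

Running `go build` in a temp module catches type errors too, but the parse check alone already separates a surprising number of outputs.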
Surprisingly (at least to me), Kimi 2.5 gave the best cost/performance ratio for this particular workload. It wasn’t always the most “verbose” or polished, but it required the fewest correction loops per dollar spent.
Opus 4.6 felt strong on reasoning-heavy changes, but cost scaled quickly. Gemini Pro 3 was decent but inconsistent in multi-file refactors. GLM-5 was interesting but sometimes hallucinated internal project structures.
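To make "correction loops per dollar" concrete, here's roughly the kind of arithmetic I mean (illustrative only; prices and token counts below are hypothetical, not real model pricing):

```go
package main

import "fmt"

// costPerUsefulOutput estimates dollars spent to land one accepted
// change. inTok/outTok are tokens for a single attempt; prices are per
// 1M tokens; loops is how many correction round-trips were needed.
// Crude assumption: each correction loop costs about as much as the
// original attempt.
func costPerUsefulOutput(inTok, outTok, inPrice, outPrice float64, loops int) float64 {
	perAttempt := inTok/1e6*inPrice + outTok/1e6*outPrice
	return perAttempt * float64(1+loops)
}

func main() {
	// Hypothetical: 20k input tokens, 5k output tokens, $3/$15 per 1M,
	// two correction loops before the change was accepted.
	fmt.Printf("$%.4f per accepted change\n", costPerUsefulOutput(20000, 5000, 3.0, 15.0, 2))
}
```

A cheaper model that needs three loops can still beat a pricier one that nails it first try, which is why I tracked loops and cost together rather than either alone.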
This is obviously anecdotal and project-specific.
Curious:
- What models are people here using for real-world codebases?
- Has anyone benchmarked cost against correction loops?
- Are you optimizing for raw quality, or for iteration speed per dollar?
Would love to hear other dev experiences, especially from people working in Go or other statically typed backends.