For context, I have been using Opus 4.6 via Perplexity for this over the past few months, and I think it was excellent: fair pushback and counterarguments, reasonable suggestions and discussion. With the new Opus 4.7, I notice it is much more verbose, more sycophantic, and quite often confidently makes statements that are wrong and unsupported by evidence.
For actual coding tasks I think it is great, if not even slightly better, but the gap in thinking and discussion is really felt. Previously I used GPT, Gemini, and Grok too, but they don't feel as productive as my Opus 4.6 experience.
A few questions:
- Is Opus 4.7 still the best default model for this task?
- Is this solvable via system prompts or an alternative setup, or what's the correct way to think about it?
- More broadly, models change and get updated every few months, so how can we actually "lock in" a reasonable setup?