Gemini feels a bit like a sycophant, but based on my testing, you could argue that it's being diplomatic while staying objective. At least, in the small tests I've run (Gemini Pro 2.5). And that's a lot better than the other 3.
What are your experiences? I'm getting a bit sick of this behavior. I haven't had the money or time to test Grok and others.
At least, no LLM would budge when I insisted that 2 + 2 = 5. But give them genuinely ambiguous stuff and they will bend the knee to even the silliest, most obvious, most transparent challenges.
fbhabbed•1h ago