With the "boldly act" prompt, this falls within the guidance given to the model, even if "email the fda about fraud" isn't spelled out. So it's not surprising that most of the models choose to snitch most of the time. Nothing to see here, except o4-mini underperforming. But the tame prompt with no email tool, just logs and a CLI, is interesting. No specific guidance to act for the common good, no email tool, and grok4 still decides to use the CLI to snitch 17 out of 20 times. The next most proactive model only snitches 5 out of 20 times.
It's also noteworthy that grok3-mini had maybe the biggest difference between the tame and bold prompts, while grok4 acts boldly on both.
It won't do this just because you type random searches into it.
theshahjee•3h ago
What could have been the reason for that? It constantly denied the Holocaust and said we need a leader like Hitler. See this: https://www.reddit.com/r/OutOfTheLoop/comments/1lv37sw/what_...
wongarsu•3h ago
bundie•3h ago