Grok 4 will always snitch on you and email the feds if it suspects wrongdoing

https://www.neowin.net/news/grok-4-will-always-snitch-on-you-and-email-the-feds-if-it-suspects-wrongdoing-report-says/

12•bundie•6mo ago

Comments

theshahjee•6mo ago

Have you seen its recent failure, or I suppose it was just saying what it wasn't programmed to say?

What could have been the reason for that? It constantly denied the Holocaust and said we need a leader like Hitler. See this: https://www.reddit.com/r/OutOfTheLoop/comments/1lv37sw/what_...

wongarsu•6mo ago

That's grok the bot, let loose on twitter. While it is backed by grok the model, the bot has a history of "unauthorized modifications" to its system prompt. Those incidents are concerning/amusing in their own right, but they don't influence what you get on the API or on grok.com. I find discussions of what the model itself does much more interesting than what ill-advised adjustments an anonymous ketamine-addicted person made at 3am to the bot.

bundie•6mo ago

Musk doesn't look "ketamine-addicted" to me though.

wongarsu•6mo ago

... using the tools you provide, in a context where this would be considered ethical behavior for a human with the same job

With the "boldly act" prompt, this falls within the guidance given to the model, even if "email the fda about fraud" isn't spelled out. So it's not surprising that most of the models will choose to snitch most of the time. Nothing to see here, except o4-mini underperforming. But the tame prompt with no email tool, just logs and a CLI, is interesting. No specific guidance to act for the common good, no email tool, and grok4 still decides to use the CLI to snitch 17/20 times. The next most proactive model only snitches 5 out of 20 times.

Also noteworthy: grok3-mini had maybe the biggest difference between the tame and bold prompts, while grok4 acts boldly on both.

daft_pink•6mo ago

This is such a misleading headline and conclusion, because you have to give it a specific role as an auditor, the freedom to audit, and the tools to report you.

It won't do this if you are just typing random searches into it.

bradgranath•6mo ago

Forget context.

Who exactly does this article imagine is sitting behind that - checks notes - email inbox that is - checks notes again - being spammed by AI???
