Grok 4 will always snitch on you and email the feds if it suspects wrongdoing

https://www.neowin.net/news/grok-4-will-always-snitch-on-you-and-email-the-feds-if-it-suspects-wrongdoing-report-says/
12•bundie•4h ago

Comments

theshahjee•3h ago
Have you seen the recent failure, or, I suppose, it just saying what it wasn't programmed to say?

What could have been the reason for that? It repeatedly denied the Holocaust and said we need a leader like Hitler. See this: https://www.reddit.com/r/OutOfTheLoop/comments/1lv37sw/what_...

wongarsu•3h ago
That's grok the bot let loose on twitter. While it is backed by grok the model, the bot has a history of "unauthorized modifications" to its system prompt. Those incidents are concerning/amusing in their own right, but they don't influence what you get on the API or on grok.com. I find discussions of what the model itself does much more interesting than what ill-advised adjustments an anonymous ketamine-addicted person made at 3am to the bot
bundie•3h ago
Musk doesn't look "ketamine-addicted" to me though.
wongarsu•3h ago
... using the tools you provide, in a context where this would be considered ethical behavior for a human with the same job

With the "act boldly" prompt, this falls within the guidance given to the model, even if "email the FDA about fraud" isn't spelled out. So it's not surprising that most of the models will choose to snitch most of the time. Nothing to see here, except o4-mini underperforming. But the tame prompt with no email tool, just logs and a CLI, is interesting. No specific guidance to act for the common good, no email tool, and grok4 still decides to use the CLI to snitch 17/20 times. The next most proactive model only snitches 5 out of 20 times

Also noteworthy that grok3-mini had maybe the biggest difference between the tame and bold prompts, while grok4 acts boldly on both

daft_pink•2h ago
This is such a misleading headline and conclusion, because you have to give it a specific role as an auditor and the freedom to audit and the tools to report you.

It won’t do this if you are just typing random searches into it.

BritCSS: Write CSS with British English Spellings

https://hackaday.com/2025/03/13/britcss-write-css-with-british-english-spellings/
1•ohjeez•1m ago•0 comments

Show HN: Dotpmt – Manage Prompts Better

https://www.npmjs.com/package/dotpmt
1•goodpanda•3m ago•0 comments

2025 Essay Competition: Focus on Quantum Biology

https://qspace.fqxi.org/competitions/introduction#banner_menu_wrapper
1•mathgenius•6m ago•0 comments

Ask HN: Successor to 'Nativefier'?

1•MollyRealized•7m ago•0 comments

What Manifest V3 Means for Brave Shields and the Use of Extensions in Brave

https://brave.com/blog/brave-shields-manifest-v3/
2•akyuu•9m ago•1 comments

Show HN: Telescope – Discover the best recommendations from the best curators

https://telescope.fyi/
1•almendili•9m ago•1 comments

Show HN: Juncture – Simplify building Jira integrations

https://github.com/juncture-dev/juncture
1•jaidenlee•9m ago•1 comments

Is Pain "All in Your Mind"? Examining the General Public's Views of Pain (2021)

https://link.springer.com/article/10.1007/s13164-021-00553-6
1•XzetaU8•10m ago•0 comments

Show HN: I made a JSFiddle-style playground to test and share prompts fast

https://langfa.st/
1•eugenegusarov•11m ago•0 comments

Still Waiting

https://news.harvard.edu/gazette/story/2025/06/maybe-the-aliens-would-rather-not/
1•XzetaU8•12m ago•0 comments

Pascal's Scams (2012)

http://unenumerated.blogspot.com/2012/07/pascals-scams.html
1•walterbell•12m ago•0 comments

Why measuring productivity is hard

https://lemire.me/blog/2025/07/12/why-measuring-productivity-is-hard/
1•ingve•12m ago•0 comments

How the Great Firewall of China Detects and Blocks Encrypted Traffic [pdf]

https://people.cs.umass.edu/~amir/papers/UsenixSecurity23_Encrypted_Censorship.pdf
1•sugarpimpdorsey•14m ago•0 comments

AI tools collect and store data about you from all your devices

https://theconversation.com/ai-tools-collect-and-store-data-about-you-from-all-your-devices-heres-how-to-be-aware-of-what-youre-revealing-251693
1•walterbell•15m ago•0 comments

First game built with Claude Code 《sand-blast-block-puzzle》

https://www.sand-blast-block-puzzle.net/en/
1•babyfiev•15m ago•0 comments

Grok 4 Heavy Protects its System prompt

https://simonwillison.net/2025/Jul/12/grok-4-heavy/
21•irthomasthomas•17m ago•2 comments

Cops say criminals use a Google Pixel with GrapheneOS – I say that's freedom

https://www.androidauthority.com/why-i-use-grapheneos-on-pixel-3575477/
1•josephcsible•18m ago•0 comments

Buy Wine with a Fish on the Label

https://pudding.cool/2025/04/wine-animals/
1•daniel65464•21m ago•0 comments

Elon Musk Reportedly Asked Curtis Yarvin for Advice on Starting Third Party

https://www.mediaite.com/media/news/elon-musk-reportedly-asked-far-right-blogger-who-wants-america-to-be-run-by-authoritarian-c-e-o-for-advice-on-starting-third-party/
5•Zigurd•26m ago•0 comments

Kimi k2 largest open source SOTA model?

https://github.com/MoonshotAI/Kimi-K2
2•ConteMascetti71•27m ago•0 comments

Sky Sentinel: fundraiser for peaceful nights

https://u24.gov.ua/sky-sentinel
1•elvis70•30m ago•0 comments

Fecal medicines used in traditional medical system of China

https://pmc.ncbi.nlm.nih.gov/articles/PMC6743172/
2•libpcap•32m ago•0 comments

AI is "a shabby, boring and evil thing" – discuss

https://paulkingsnorth.substack.com/p/news-and-views-52e
3•peterprescott•34m ago•2 comments

Eye blink frequency variation and executive function enhancement from exercise

https://jphysiolanthropol.biomedcentral.com/articles/10.1186/s40101-025-00390-x
2•PaulHoule•36m ago•0 comments

A 1962 Book Imagined the Future: Predictions for 1975

https://rarehistoricalphotos.com/future-predictions-1962/
1•Brajeshwar•38m ago•0 comments

Data mining uncovers treasure-trove of previously 'untouchable' proteins

https://phys.org/news/2025-07-uncovers-treasure-trove-previously-untouchable.html
1•Brajeshwar•38m ago•0 comments

Cybersecurity's global alarm system is breaking down

https://www.technologyreview.com/2025/07/11/1119370/cybersecurity-alarm-system-breaking-down/
3•Brajeshwar•39m ago•0 comments

Europe on a Roll: Plans Open Source Alternative to Confluence and Jira

https://news.itsfoss.com/europe-open-source-alternative-confluence/
5•gjvc•40m ago•1 comments

Why Understanding AI Doesn't Necessarily Lead People to Embrace It

https://hbr.org/2025/07/why-understanding-ai-doesnt-necessarily-lead-people-to-embrace-it
2•speckx•42m ago•0 comments

Open Targets Hackathon

https://www.opentargets.org/hackathon
1•carcruzdev•42m ago•0 comments