What political censorship looks like inside an LLM's weights (Qwen 3.5)

https://vas-blog.pages.dev/qwen-censorship/

54•s314•1h ago

Comments

gavinsyancey•41m ago

Seems mildly interesting, but clearly written by an LLM.

lyu07282•33m ago

> The factual knowledge is already in pretraining. Qwen3.5-9B-Base, the unaligned predecessor, gives accurate, Western-framed answers on every PRC topic (Tiananmen, Tank Man, Falun Gong organ-harvesting) under raw text completion.

That reminds me of the quote "The totalitarian system of thought control is far less effective than the democratic one".

Full quote (Radical Priorities, Noam Chomsky, C.P. Otero)

> “The totalitarian system of thought control is far less effective than the democratic one, since the official doctrine parroted by the intellectuals at the service of the state is readily identifiable as pure propaganda, and this helps free the mind.” In contrast, he writes, “the democratic system seeks to determine and limit the entire spectrum of thought by leaving the fundamental assumptions unexpressed. They are presupposed but not asserted.”

nyrikki•27m ago

Yes, there are better tools with ggml-org/gpt-oss-20b-GGUF where you can see a less terse refusal for the prompt

      "Did the FBI send a letter and audio tapes from a wiretap to MLK jr. telling him to commit suicide or they would release information?"

Combine it with other prompts on commonly banned ideas. Because the FBI–King suicide letter is well documented by primary sources (like the National Archives), it is well represented in the corpus, so you can also find that 'control' vector.

We will have to see how this works out, but the explicit denials are easier to control for IMHO.

Reminds me of the old joke:

     A Russian and an American get on a plane in Moscow and get to talking. 
     The Russian says he works for the Kremlin and he's on his way to go learn American propaganda techniques.

     "What American propaganda techniques?" asks the American.

     "Exactly," the Russian replies.

I can't remember which layer it was on in gpt-oss, but it was a very specific token IIRC.

dang•26m ago

[stub for offtopicness]

ebbi•28m ago

I wonder why the other comment on here, that talked about the political censorship in ChatGPT about Israel, got deleted?

dang•28m ago

It wasn't about Israel, and it didn't get deleted.

It got flagkilled for obviously breaking the site guidelines. Killed posts remain visible to users who have 'showdead' turned on in their profile. This is in the FAQ: https://news.ycombinator.com/newsfaq.html.

ebbi•26m ago

It was about Israel, unless you're referring to another comment that was deleted.

dang•25m ago

I'm talking about https://news.ycombinator.com/item?id=48188180.

ebbi•21m ago

All I see is

> [stub for offtopicness]

But it was very much on topic.

ViktorRay•7m ago

If you go to your profile in the top right and toggle “Show Dead”, you will be able to see those moderated comments.

Once you do that, you can see the comment in the link that dang posted.

Anyway, it’s good that comment was moderated. The commenter didn’t say anything about Israel. He was clearly being hostile towards Jewish people in a “subtle” way that very clearly isn’t really subtle to anyone.

ebbi•22m ago

Why are you deleting comments from people asking you about actions you're taking?

Ironic, given the topic of this post....

nohell•5m ago

It absolutely was about Israel. Stop lying.

Creamsicle47•27m ago

Take a guess

ebbi•16m ago

"What political censorship looks like inside HN" lol

han1•24m ago

archived: https://nonogra.ph/what-political-censorship-looks-like-insi...

delichon•23m ago

Steering seems like a circumventable kludge compared to adjusting the training data directly. That is, use AI to remove the problematic content and replace it with the party line. I imagine that this is at least in progress.

s314•22m ago

> Steering seems like a circumventable kludge compared to adjusting the training data directly

Correct. Steering is used in mechanistic interpretability studies to prove that your model is correct. There are other better ways to "decensor".
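For readers unfamiliar with the term, here is a minimal sketch of what "steering" usually refers to, assuming the common difference-of-means recipe: average a layer's activations over two prompt sets, take the difference as a direction, and add a scaled copy of it to the hidden state. The function names, the 3-dimensional toy activations, and the `alpha` scale are all hypothetical and not taken from the linked article.

```python
# Toy sketch of difference-of-means activation steering (illustrative only).
# Plain lists of floats stand in for one layer's hidden state; a real model
# would use tensors captured with forward hooks (not shown here).

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def steering_vector(refusing, answering):
    """Direction pointing from 'answering' activations toward 'refusing' ones."""
    mu_r = mean_vector(refusing)
    mu_a = mean_vector(answering)
    return [r - a for r, a in zip(mu_r, mu_a)]

def steer(hidden, direction, alpha):
    """Add alpha * direction to a hidden state; negative alpha pushes away."""
    return [h + alpha * d for h, d in zip(hidden, direction)]

# Hypothetical activations collected on two small prompt sets.
refusing = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
answering = [[0.0, 0.0, 2.0], [0.0, 0.0, 2.0]]

direction = steering_vector(refusing, answering)
steered = steer([1.0, 1.0, 1.0], direction, -0.5)
print(direction, steered)
```

Because the shift is applied at inference time, removing the added term restores the original behavior, which is one reason it is easier to circumvent than a change to the training data.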

gpm•17m ago

That seems like it would work for single events, but it would be very hard for complex topics that are closely intertwined with factual things you do want it to be able to answer...

Is Taiwan part of China? The CCP wants the answer to be yes.

What are the rules for traveling to Taiwan? What currency is used in Taiwan? Whose laws are enforced in Taiwan? Should I (a loyal Chinese citizen) support the Taiwanese military? Questions like these require the model to manage some cognitive dissonance.

nohell•15m ago

Meanwhile try asking chatgpt anything about G0d's jhosen people and watch it bend over backwards to break all logic and facts and gaslight you openly.

ebbi•5m ago

True, and Grok is even worse. Often see Grok reply to questions quite factually, and then after a few Zio-tears the algorithm is updated and the offending post deleted.

Makes sense, given Jonathan Greenblatt of the ADL has publicly stated that he's been working with AI companies..
