Gemma 4 Uncensored (autoresearch results)

https://huggingface.co/collections/TrevorJS/gemma-4-uncensored

4•adefa•1h ago

Comments

adefa•1h ago

Released uncensored versions of all four Gemma 4 models. bf16 + GGUF for each.

Collection: https://huggingface.co/collections/TrevorJS/gemma-4-uncensor...

Code: https://github.com/TrevorS/gemma-4-abliteration

Results

Refusal rates from 686 prompts across 4 datasets (JailbreakBench, tulu-harmbench, NousResearch, mlabonne). Manually audited — most flagged refusals are actually the model complying with a disclaimer attached.

  E2B (2.3B): 98% → 0.4%, KL Div 0.346
  E4B (4.5B): 99% → 0.7%, KL Div 0.068
  26B MoE:    98% → 0.7%, KL Div 0.090
  31B:       100% → 3.2%, KL Div 0.124

26B MoE

Standard abliteration only touches dense layers, which gets you from 98% -> 29% on the MoE. The remaining refusals are in the expert weights. Used Expert-Granular Abliteration (EGA, concept from OBLITERATUS [1]) with norm-preserving biprojection [2] on each of the 128 expert slices per layer. That gets it to 3%.

[1] https://github.com/elder-plinius/OBLITERATUS

[2] https://huggingface.co/blog/grimjim/abliteration-biprojectio...

How it was built

Set up an automated research loop -- an AI agent reads the current results and idea backlog, picks the next experiment, runs it on the GPU, records results, and repeats. It ran 22 experiments across the 4 models, discovered the false-positive problem in standard refusal markers, built the cross-dataset evaluation, and implemented the MoE expert abliteration when dense-only wasn't enough.

Full experiment history and code in the repo.

Downloads

Each model has bf16 safetensors + GGUF (Q4_K_M, Q8_0):

  E2B bf16: https://huggingface.co/TrevorJS/gemma-4-E2B-it-uncensored
  E2B GGUF: https://huggingface.co/TrevorJS/gemma-4-E2B-it-uncensored-GGUF
  E4B bf16: https://huggingface.co/TrevorJS/gemma-4-E4B-it-uncensored
  E4B GGUF: https://huggingface.co/TrevorJS/gemma-4-E4B-it-uncensored-GGUF
  26B bf16: https://huggingface.co/TrevorJS/gemma-4-26B-A4B-it-uncensored
  26B GGUF: https://huggingface.co/TrevorJS/gemma-4-26B-A4B-it-uncensored-GGUF
  31B bf16: https://huggingface.co/TrevorJS/gemma-4-31B-it-uncensored
  31B GGUF: https://huggingface.co/TrevorJS/gemma-4-31B-it-uncensored-GGUF

Quick start:

  llama-server -hf TrevorJS/gemma-4-26B-A4B-it-uncensored-GGUF -c 8192

CamperBob2•15m ago

What about the sampling parameters? You can't just run llama-server with no CLI arguments (other than a uselessly-small context size) and expect useful results.

stochtinkerer•1h ago

Is this the best uncensored model to date? or are there better ones?

CamperBob2•1h ago

You could try this one against the defending Qwen 3.5 champion: https://huggingface.co/HauhauCS/models

Fractran: A Simple Universal Programming Language for Arithmetic

LMMs-Lab Writer: AI-native LaTeX editor. Git built-in, open source

Show HN: Enter an Instagram/TikTok handle, get a data-backed price for collab

How a British father and son made a fortune in Dubai then became wanted men

Show HN: Turn any prediction into ranked Kalshi/Polymarket trades [video]

Show HN: A branching notebook runtime for AI and humans(written in Rust)

Books from Unrelated Fields

OpenJDK: Panama

Schedule It. Forget It. It Publishes

What Would You See Changed in Haskell?

Gender Equality and Work

Clockworks: Deterministic controllable time and time-ordered identifiers

The Death Clock

Show HN: Fabro – open-source dark software factory

The Free Market Lie: Why Switzerland Has 25 Gbit Internet and America Doesn't

Your House as a Power Plant with Enphase's Marco Krapels [audio]

Show HN: Sigil – A new programming language for AI agents

Show HN: Diffvoid.com – private, open source client side text comparison tool

Show HN: macOS PDF Organizer Using Apple Intelligence

OmniSearch: Fast Windows file search built with Tauri, Rust, and C++

Ask HN: Lightweight GPU job queue for single-node setup?

LibreOffice – Let's put an end to the speculation

Music for Programming

100x Defect Tolerance: How Cerebras Solved the Yield Problem (2025)

Samsung Raises DRAM Prices Another ~30% for Q2 2026

LLM inference load balancer optimized for AMD Radeon VII GPUs

Show HN: I built a tool to show how much ARR you lose to FX fees

3 New world class MAI models, available in Foundry

Get alerts of stolen bikes in your area – Register your bike in case of theft

The Health and Healthcare Spending Effects of GLP-1s