Small update post UGI results
Interestingly, this abliterated version slightly outperforms NVIDIA's base model in raw intelligence.
Looks like minimizing KL divergence (first priority) while minimizing refusals (second priority) can sometimes produce a model that outperforms the base model across most benchmarks.
This has been speculated about for quite some time ("the RLHF alignment tax"), and it is interesting to see in practice, although the difference is small, so it might be a fluke. "More testing is needed" is a bit of a cliché, but true nonetheless.
Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated is an abliterated variant of NVIDIA's Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct (itself derived from Meta's Llama 3.1 8B), with surgical removal of refusal mechanisms. This model maintains the full ONE MILLION token context window while eliminating safety guardrails through orthogonalization techniques.
- KL divergence: <0.005
- Refusals: ~8%
What is KL divergence?
Think of it as a way to measure how far the abliterated model's "world model" has drifted from the original's; the lower the KL divergence, the closer the two models' output distributions are to each other.
If the original model thinks making pineapple pizza is a crime against humanity (it is), then the abliterated model will still hold that belief, but if asked how to make one, it will still tell you how (probably after a disclaimer about what an abomination it is). In other words, most of the knowledge, quirks, and capabilities are preserved.
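As a toy illustration of the metric itself, here is how per-position KL divergence between the original and abliterated models' next-token distributions could be computed. This is a minimal sketch with NumPy, not the evaluation pipeline actually used for the numbers above:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i).

    p: next-token probabilities from the original model.
    q: next-token probabilities from the abliterated model.
    eps guards against log(0) for zero-probability tokens.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Identical distributions give (near) zero divergence:
print(kl_divergence([0.7, 0.2, 0.1], [0.7, 0.2, 0.1]))
# Distributions that disagree give a positive value:
print(kl_divergence([0.9, 0.05, 0.05], [0.3, 0.4, 0.3]))
```

In practice this would be averaged over many prompts and token positions; a mean below 0.005 means the abliterated model's output distribution is nearly indistinguishable from the original's on non-refusal content.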
Technical Specs
- Base Model: Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct
- Parameters: 8B
- Context Length: 1M tokens
- Architecture: Llama (decoder-only transformer)
- Precision: fp32
- Method: Orthogonalization-based abliteration
- License: Llama 3.1 Community License
Methodology
- Identifies refusal direction vectors in activation space
- Orthogonalizes weights to inhibit activation along these directions
- Preserves (mostly) all other model behaviors and knowledge
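The steps above can be sketched roughly as follows. This is a simplified NumPy illustration: the difference-of-means heuristic for finding the refusal direction and the function names are assumptions for the sake of the example, not the exact implementation used for this model:

```python
import numpy as np

def refusal_direction(refusal_acts, compliant_acts):
    """Estimate a refusal direction as the (normalized) difference of mean
    activations between prompts the model refuses and prompts it answers.

    Both inputs: arrays of shape (n_prompts, hidden_dim).
    """
    d = refusal_acts.mean(axis=0) - compliant_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def orthogonalize(W, r):
    """Project the refusal direction r out of a weight matrix W that writes
    into the residual stream: W' = W - r r^T W.

    For a unit vector r, the output W' @ x has zero component along r for
    any input x, so the model can no longer 'write' the refusal signal.
    """
    return W - np.outer(r, r) @ W
```

Applying `orthogonalize` to every matrix that writes into the residual stream inhibits activation along the refusal direction while leaving the orthogonal complement, and hence most other behavior, untouched.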
Model Details
Intended use: General Tasks.
Censorship level: Low - Very Low, 7.2 / 10 (10 = completely uncensored)
UGI score:
Citation Information
@llm{Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated,
author = {SicariusSicariiStuff},
title = {Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/SicariusSicariiStuff/Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated}
}
Other stuff
- Impish_LLAMA_4B the “Impish experience”, now runnable on spinning rust & toasters.
- SLOP_Detector Nuke GPTisms, with SLOP detector.