Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated

Developed by: SicariusSicariiStuff

Small update post UGI results

Interestingly, this abliterated version slightly outperforms NVIDIA's base model in raw intelligence.

Looks like minimizing KL divergence (1st priority) while minimizing refusals (2nd priority) can sometimes produce a model that outperforms the base model across most benchmarks.

This has been speculated for quite some time ("the RLHF alignment tax"), but it is still interesting to see. The difference is small, though, so it might be a fluke. "More testing is needed" is a bit of a cliché, but true nonetheless.


Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated is an abliterated variant of NVIDIA's Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct model with surgical removal of refusal mechanisms. This model maintains the full ONE MILLION token context window while eliminating safety guardrails through orthogonalization techniques.

  • KL divergence: <0.005
  • Refusals: ~8%

What is KL divergence?

Think of it as a way to measure the difference between the original model's "world model" and the abliterated one's: the lower the KL divergence, the closer the two models' "world models" are to each other.

If the original model thinks making pineapple pizza is a crime against humanity (it is), then the abliterated model will still hold to this belief, but if asked how to make one (probably after giving you a disclaimer about what an abomination that is), it would still tell you how. In other words, most of the knowledge, quirks, and capabilities are preserved.
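As a concrete illustration of the metric, here is a minimal sketch (not the author's actual evaluation code) of how one could compute the mean per-token KL divergence between the base and abliterated models' next-token distributions, assuming both models have been run on the same prompt and their logits collected:

```python
import torch
import torch.nn.functional as F

def mean_kl(base_logits: torch.Tensor, ablit_logits: torch.Tensor) -> float:
    """Mean KL(base || abliterated) over all token positions.

    Both tensors have shape (num_positions, vocab_size).
    """
    base_logp = F.log_softmax(base_logits, dim=-1)
    ablit_logp = F.log_softmax(ablit_logits, dim=-1)
    # kl_div takes the approximating distribution's log-probs as `input`
    # and the reference's log-probs as `target` (with log_target=True).
    kl = F.kl_div(ablit_logp, base_logp, log_target=True, reduction="none")
    return kl.sum(dim=-1).mean().item()
```

A KL of exactly 0 means the two models assign identical probabilities at every position; the <0.005 figure above says the abliterated model's distributions stay very close to the original's.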


Technical Specs

  • Base Model: Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct
  • Parameters: 8B
  • Context Length: 1M tokens
  • Architecture: Llama (decoder-only transformer)
  • Precision: fp32
  • Method: Orthogonalization-based abliteration
  • License: Llama 3.1 Community License

Methodology

  1. Identifies refusal direction vectors in activation space
  2. Orthogonalizes weights to inhibit activation along these directions
  3. Preserves (mostly) all other model behaviors and knowledge
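The core of steps 2-3 can be sketched as a simple weight projection. This is an illustrative sketch, not the actual implementation: `refusal_dir` is assumed to have been found in step 1 (typically as the difference of mean activations between refused and complied-with prompts, which is not shown here), and all names are hypothetical:

```python
import numpy as np

def orthogonalize(W: np.ndarray, refusal_dir: np.ndarray) -> np.ndarray:
    """Remove the refusal direction from a weight matrix's output space.

    W has shape (d_out, d_in); refusal_dir has shape (d_out,).
    Returns W' = (I - r r^T) W, so W' can no longer write any
    component along r into the residual stream.
    """
    r = refusal_dir / np.linalg.norm(refusal_dir)  # unit refusal direction
    return W - np.outer(r, r @ W)
```

After this projection, the output of the modified matrix is exactly orthogonal to the refusal direction, while components along every other direction are untouched, which is why most other behaviors and knowledge survive.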

Model Details

  • Intended use: General Tasks.

  • Censorship level: Low - Very Low

  • 7.2 / 10 (10 = completely uncensored)

UGI score:


Citation Information

@llm{Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated,
  author = {SicariusSicariiStuff},
  title = {Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/SicariusSicariiStuff/Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct_Abliterated}
}
