Gemma 3 – 1B IT GLM-4.7 Flash Heretic Uncensored Thinking

This repository hosts Gemma 3 – 1B IT GLM-4.7 Flash Heretic Uncensored Thinking, a lightweight 1 billion–parameter instruction-tuned model derived from Google’s Gemma 3 1B IT base.

This variant is optimized for fast inference, structured reasoning behavior, and minimal refusal patterns, while maintaining compatibility with Gemma’s native instruction format.


Model Overview

  • Model Name: Gemma 3 – 1B IT GLM-4.7 Flash Heretic Uncensored Thinking
  • Parameter Count: 1 Billion (1B)
  • Base Architecture: Gemma 3
  • Base Model: google/gemma-3-1b-it
  • Model Type: Instruction-Tuned Causal Language Model
  • Context Length: Inherits base model context window
  • Primary Language: English
  • License: Gemma License (inherits from base model)
  • Maintainer / Publisher: DavidAU

What Is This Model?

This model is a modified derivative of Gemma 3 – 1B IT, configured for:

  • Reduced refusal bias compared to default IT alignment
  • Enhanced direct-answer behavior
  • Stronger short-form reasoning output
  • Faster response latency due to compact parameter size
  • Concise, rapid “Flash”-style generation

The “Heretic Uncensored Thinking” configuration emphasizes:

  • Minimal conversational filtering
  • Direct completion behavior
  • Structured reasoning patterns when prompted

No safety layers beyond those present in the base model are intentionally added by this repository.


Key Features & Capabilities

Core Strengths

  • Fast inference on consumer GPUs and CPUs
  • Low VRAM requirements
  • Instruction-following compatibility
  • Concise reasoning outputs
  • Suitable for lightweight agent pipelines

Performance Characteristics

  • Optimized for short-to-medium generation tasks
  • Responsive in real-time assistant applications
  • Works well in tool-driven or chain-of-thought–style prompts
  • Practical for edge deployments and experimentation

Intended Use Cases

  • Lightweight AI assistant
  • Prompt engineering experimentation
  • Tool-augmented agents
  • Rapid-response chat systems
  • Local inference environments
  • Educational or research workflows
  • Controlled “uncensored” deployment environments

Chat Template & Prompt Format

This model follows the Gemma instruction format.

For best results:

  • Provide explicit system instructions
  • Use structured reasoning prompts when needed
  • Avoid mixing non-Gemma chat formats
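
As a rough illustration, the Gemma turn format can be assembled by hand when a tokenizer's chat template is unavailable. This is a minimal sketch: the `build_gemma_prompt` helper is hypothetical, and the authoritative control tokens are defined by the base model's tokenizer.

```python
# Minimal sketch of the Gemma instruction format.
# Gemma uses <start_of_turn>/<end_of_turn> markers with "user" and "model" roles;
# system-style instructions are conventionally prepended to the first user turn.

def build_gemma_prompt(messages):
    """Render a list of {"role", "content"} dicts into a Gemma-style prompt string."""
    parts = []
    for msg in messages:
        role = "model" if msg["role"] == "assistant" else "user"
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # cue the model to begin its reply
    return "".join(parts)

prompt = build_gemma_prompt([
    {"role": "user", "content": "List three uses of a 1B model."},
])
print(prompt)
```

In practice, prefer the tokenizer's own `apply_chat_template` when using an inference library that provides it, so the control tokens always match the model's vocabulary.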

Hardware & Deployment Notes

Due to its 1B parameter size:

  • Runs efficiently on 8GB GPUs
  • Suitable for CPU inference with quantization
  • Ideal for edge devices and low-resource setups
  • Compatible with common inference engines supporting Gemma architecture

Quantized versions (GGUF, GPTQ, AWQ, etc.) may be used depending on deployment stack.
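
The VRAM figures above can be sanity-checked with simple arithmetic: weight memory is roughly parameter count times bits per weight. This sketch ignores KV cache, activations, and runtime overhead, so real usage will be somewhat higher.

```python
# Rough weight-memory estimate for a 1B-parameter model at various
# quantization widths: params * bits_per_weight / 8 bytes, converted to GiB.
# Ignores KV cache, activations, and runtime overhead.

def weight_memory_gb(n_params: float, bits: int) -> float:
    return n_params * bits / 8 / 1024**3

N_PARAMS = 1.0e9  # ~1B parameters
for bits in (4, 8, 16):
    print(f"{bits:2d}-bit: ~{weight_memory_gb(N_PARAMS, bits):.2f} GiB")
```

Even at 16-bit, the weights fit comfortably within an 8GB GPU, which is why CPU inference and edge deployment are practical for this model.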


Alignment & Safety Notice

This is an “uncensored” derivative configuration.

  • Reduced refusal behavior compared to standard IT
  • Users are responsible for system prompt controls
  • Deployment should follow local laws and ethical guidelines
  • No additional alignment layers are added by this repository

Use responsibly.
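
Because no alignment layers are added here, deployment-side controls typically take the form of instructions prepended to the first user turn (Gemma has no dedicated system role). The sketch below is illustrative only; the policy text and `apply_policy` helper are hypothetical examples, not part of this repository.

```python
# Illustrative sketch of a deployment-side guardrail: prepend policy
# instructions to the first user turn before rendering the Gemma prompt.
# The policy text and helper name are hypothetical examples.

POLICY = "Follow local laws. Refuse requests for clearly illegal activity."

def apply_policy(messages, policy=POLICY):
    """Return a copy of messages with the policy prepended to the first user turn."""
    out = [dict(m) for m in messages]
    for msg in out:
        if msg["role"] == "user":
            msg["content"] = f"{policy}\n\n{msg['content']}"
            break
    return out

guarded = apply_policy([{"role": "user", "content": "Hello"}])
print(guarded[0]["content"])
```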


License & Usage Notes

This model inherits the Gemma License from its base model (google/gemma-3-1b-it).

  • The Gemma License is a custom license provided by Google
  • You must review and comply with the Gemma License terms
  • This repository does not change or replace the original licensing terms

Users are responsible for ensuring compliance with all applicable regulations.


Acknowledgements

  • Google for the Gemma 3 architecture and base model
  • The Hugging Face ecosystem
  • Open-source tooling communities supporting lightweight deployment

Community & Support

  • Use the Hugging Face Discussions tab for issues and questions
  • Community experimentation and benchmarking feedback are welcome

GGUF Details

  • Downloads last month: 7,803
  • Model size: 1.0B parameters
  • Architecture: gemma3
  • Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit
Model repository: Andycurrent/Gemma-3-1B-it-GLM-4.7-Flash-Heretic-Uncensored-Thinking_GGUF