---
license: mit
base_model: deepseek-ai/DeepSeek-Prover-V1.5-Base
tags:
- deepseek
- prover
- lean4
- theorem-proving
- quantized
- sinq
- 4-bit
library_name: transformers
---

# DeepSeek-Prover-V1.5-Base SINQ 4-bit

This is a 4-bit quantized version of [DeepSeek-Prover-V1.5-Base](https://huggingface.co/deepseek-ai/DeepSeek-Prover-V1.5-Base) produced with [SINQ (Sinkhorn-Normalized Quantization)](https://github.com/huawei-csl/SINQ), a calibration-free quantization method.

## Model Details

- **Base Model**: DeepSeek-Prover-V1.5-Base (7B parameters)
- **Quantization Method**: SINQ 4-bit with 2D tiling
- **Group Size**: 128
- **Model Size**: ~3.5 GB (a 75% reduction from the ~14 GB original)
- **Memory Usage**: ~5 GB of GPU memory at inference
- **Quality**: Lower than the original (~40-50% word overlap with full-precision outputs)

## Quantization Configuration

```python
BaseQuantizeConfig(
    nbits=4,
    group_size=128,
    method="sinq",
    tiling_mode="2D",
    axis=1
)
```

## Usage

### Installation

```bash
pip install torch transformers
pip install git+https://github.com/huawei-csl/SINQ.git
```

### Loading the Model

```python
import torch
from sinq.patch_model import AutoSINQHFModel
from transformers import AutoTokenizer

# Load the tokenizer and the quantized model
tokenizer = AutoTokenizer.from_pretrained("Minhdn/deepseek-prover-sinq-4bit")
model = AutoSINQHFModel.from_quantized_safetensors(
    "Minhdn/deepseek-prover-sinq-4bit",
    device="cuda:0",
    compute_dtype=torch.bfloat16
)

# Generate a Lean 4 proof continuation
prompt = "theorem add_comm (a b : Nat) : a + b = b + a := by"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```

## Performance

- **Inference Speed**: ~3-4 tokens/second (slower because gemlite kernels only support 1D tiling, so 2D-tiled models fall back to a slower path)
- **Memory Savings**: 75% compared to the original model
- **Quality**: Moderate; best suited for experimentation or resource-constrained environments

## Limitations

- Lower quality than the original model (~40-50% word overlap)
- Slower inference due to 2D tiling (gemlite only supports 1D tiling)
- May produce incorrect Lean 4 proofs more frequently than the original
- Not recommended for production use where correctness is critical

## Recommendations

For better quality at the cost of a larger model footprint, consider:

- **6-bit version**: [Minhdn/deepseek-prover-sinq-6bit](https://huggingface.co/Minhdn/deepseek-prover-sinq-6bit) (~90% quality, ~5 GB)
- **Original model**: [deepseek-ai/DeepSeek-Prover-V1.5-Base](https://huggingface.co/deepseek-ai/DeepSeek-Prover-V1.5-Base) (100% quality, ~14 GB)

## Citation

If you use this model, please cite both the original DeepSeek-Prover paper and SINQ:

```bibtex
@article{deepseek2024prover,
  title={DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data},
  author={DeepSeek-AI},
  journal={arXiv preprint arXiv:2405.14333},
  year={2024}
}

@article{sinq2025,
  title={SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLMs},
  author={SINQ Authors},
  journal={arXiv preprint arXiv:2509.22944},
  year={2025}
}
```

## License

This model inherits the MIT license from the original DeepSeek-Prover-V1.5-Base model.
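
## Reproducing the Quantization

For reference, the snippet below sketches how a checkpoint like this one can be produced with SINQ. It reuses the exact `BaseQuantizeConfig` from the "Quantization Configuration" section above; the `quantize_model` and `save_quantized_safetensors` calls follow the SINQ README and mirror the `from_quantized_safetensors` loader used earlier, but their exact signatures may vary between SINQ versions, so treat this as a sketch and check the [SINQ repository](https://github.com/huawei-csl/SINQ) for the current API.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sinq.patch_model import AutoSINQHFModel
from sinq.sinqlinear import BaseQuantizeConfig

base = "deepseek-ai/DeepSeek-Prover-V1.5-Base"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Same settings as this checkpoint (see "Quantization Configuration")
quant_config = BaseQuantizeConfig(
    nbits=4,
    group_size=128,
    method="sinq",
    tiling_mode="2D",
    axis=1
)

# SINQ is calibration-free, so no calibration dataset is needed.
# NOTE: call follows the SINQ README; verify against your installed version.
AutoSINQHFModel.quantize_model(
    model,
    tokenizer=tokenizer,
    quant_config=quant_config,
    compute_dtype=torch.bfloat16,
    device="cuda:0"
)

# Assumed counterpart of the from_quantized_safetensors() loader used above.
AutoSINQHFModel.save_quantized_safetensors(model, "deepseek-prover-sinq-4bit")
```

Note that this requires loading the full-precision weights first (~14 GB in bfloat16), so quantization needs substantially more memory than inference with the quantized model.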