---
license: mit
base_model: fredzzp/open-dcoder-0.5B
tags:
- code-generation
- diffusion-model
- masked-diffusion
- code-correction
- python
datasets:
- code
language:
- code
pipeline_tag: text-generation
---

# CDLM-0.5B

## Model Description

**CDLM-0.5B** is a fine-tuned version of [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B), fine-tuned with **error-aware training** using the mixture objective proposed in our paper on Corrective Diffusion Language Models. The model is designed to improve error-aware confidence and targeted refinement capabilities in code generation tasks.

### Key Features

- **Base Model**: [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B) (a masked diffusion language model based on Qwen2)
- **Training Method**: Error-aware training with a mixture objective that explicitly supervises visible incorrect tokens
- **Architecture**: Masked Diffusion Language Model (MDLM)
- **Parameters**: ~0.5B

## Training Details

This model was fine-tuned from `fredzzp/open-dcoder-0.5B` using error-aware training with a mixture objective. For detailed information on the training methodology, please refer to our paper: [Corrective Diffusion Language Models](https://arxiv.org/pdf/2512.15596).
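
As a rough intuition, a mixture objective of this kind combines the standard masked-diffusion term (predict the true token at masked positions) with a corrective term that also supervises visible positions whose current token is wrong. The toy sketch below illustrates that idea only; the function names, the per-term averaging, and the weight `lam` are assumptions for illustration, not the paper's exact formulation.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def mixture_loss(logits, targets, masked, visible_errors, lam=1.0):
    """Toy error-aware mixture objective (illustrative only).

    logits:          per-position vocabulary logits, [seq_len][vocab]
    targets:         ground-truth token ids, [seq_len]
    masked:          positions hidden by the diffusion noising process
    visible_errors:  visible positions whose current token is incorrect
    """
    def ce(pos):
        # Cross-entropy of the true token at one position.
        probs = softmax(logits[pos])
        return -math.log(probs[targets[pos]])

    # Standard MDLM term: reconstruct the true token at masked positions.
    l_mask = sum(ce(p) for p in masked) / max(len(masked), 1)
    # Corrective term: also supervise visible-but-incorrect tokens.
    l_err = sum(ce(p) for p in visible_errors) / max(len(visible_errors), 1)
    return l_mask + lam * l_err
```

With `visible_errors` empty this reduces to the usual masked-prediction loss; the extra term is what gives the model a training signal at tokens it can see but should distrust.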

## Usage

### Installation

```bash
pip install torch transformers
```

### Code Generation

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Shuibai12138/CDLM-0.5B"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
).to(device)

# Generate code
prompt = "def fibonacci(n):"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Use diffusion generation
outputs = model.diffusion_generate(
    inputs=input_ids,
    max_new_tokens=100,
    steps=16,
    temperature=0.8
)

prompt_len = input_ids.shape[1]
generated_text = tokenizer.decode(outputs.sequences[0][prompt_len:], skip_special_tokens=True)

print("Generated Code:")
print(generated_text)
```

**Note**: This model uses a custom `diffusion_generate` method, so `trust_remote_code=True` is required when loading the model.

### Iterative Refinement

The model supports iterative refinement for code correction. See the [CDLM repository](https://github.com/zhangshuibai/CDLM) for examples of using the model on code correction tasks.
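
Conceptually, refinement of this kind remasks the tokens the model is least confident about and re-predicts them, repeating for a few rounds. The self-contained toy below sketches that loop only; `confidence_fn`, `predict_fn`, the remasking fraction, and the round count are hypothetical stand-ins, not the repository's actual API.

```python
def refine(tokens, confidence_fn, predict_fn, mask_token="<mask>",
           rounds=3, frac=0.25):
    """Sketch of confidence-guided iterative refinement (hypothetical API).

    Each round: remask the least-confident fraction of tokens, then let
    the model re-predict the masked slots.
    """
    tokens = list(tokens)  # work on a copy; do not mutate the input
    for _ in range(rounds):
        conf = confidence_fn(tokens)                   # per-token confidence
        k = max(1, int(len(tokens) * frac))
        # Indices of the k least-confident tokens.
        worst = sorted(range(len(tokens)), key=lambda i: conf[i])[:k]
        for i in worst:
            tokens[i] = mask_token
        tokens = predict_fn(tokens)                    # fill masked slots
    return tokens
```

An error-aware model is a natural fit for this loop because its confidence at visible-but-wrong tokens is trained to be low, so those positions are the ones most likely to be remasked and corrected.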

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{zhang2025correctivediffusionlanguagemodels,
  title={Corrective Diffusion Language Models},
  author={Shuibai Zhang and Fred Zhangzhi Peng and Yiheng Zhang and Jin Pan and Grigorios G. Chrysos},
  year={2025},
  eprint={2512.15596},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2512.15596},
}
```

## Related Resources

- **Paper**: [Corrective Diffusion Language Models](https://arxiv.org/pdf/2512.15596)
- **Code Repository**: [zhangshuibai/CDLM](https://github.com/zhangshuibai/CDLM)
- **Collection**: [Hugging Face Collection](https://huggingface.co/collections/Shuibai12138/cdlm)
- **Base Model**: [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B)

## License

This model is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Contact

For questions and issues, please contact:

**Shuibai Zhang** <shuibai@cs.wisc.edu>