---
license: mit
base_model: fredzzp/open-dcoder-0.5B
tags:
- code-generation
- diffusion-model
- masked-diffusion
- code-correction
- python
datasets:
- code
language:
- code
pipeline_tag: text-generation
---
# CDLM-0.5B
## Model Description
**CDLM-0.5B** is a fine-tuned version of [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B), trained with the error-aware mixture objective proposed in our paper on Corrective Diffusion Language Models. The objective improves the model's error-aware confidence and its ability to perform targeted refinement in code generation tasks.
### Key Features
- **Base Model**: [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B) (a masked diffusion language model based on Qwen2)
- **Training Method**: Error-aware training with mixture objective that explicitly supervises visible incorrect tokens
- **Architecture**: Masked Diffusion Language Model (MDLM)
- **Parameters**: ~0.5B
## Training Details
This model was fine-tuned from `fredzzp/open-dcoder-0.5B` using error-aware training with a mixture objective. For detailed information on the training methodology, please refer to our paper: [Corrective Diffusion Language Models](https://arxiv.org/pdf/2512.15596).
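As a rough intuition for the objective, the sketch below combines the standard masked-diffusion reconstruction loss with an extra cross-entropy term over visible-but-incorrect tokens. This is an illustrative approximation only: the function name, the `error_weight` hyperparameter, and the exact weighting are assumptions, not the paper's formulation; see the paper for the actual objective.

```python
import torch
import torch.nn.functional as F

def mixture_loss(logits, targets, masked, visible_error, error_weight=1.0):
    """Illustrative error-aware mixture objective (not the paper's exact loss).

    logits:        (batch, seq, vocab) model predictions
    targets:       (batch, seq) ground-truth token ids
    masked:        (batch, seq) bool, positions hidden by the diffusion mask
    visible_error: (batch, seq) bool, visible positions holding wrong tokens
    """
    # Per-token cross-entropy; cross_entropy expects (batch, vocab, seq).
    per_tok = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
    # Standard MDLM term: reconstruct the masked tokens.
    masked_loss = (per_tok * masked).sum() / masked.sum().clamp(min=1)
    # Corrective term: also supervise visible incorrect tokens.
    error_loss = (per_tok * visible_error).sum() / visible_error.sum().clamp(min=1)
    return masked_loss + error_weight * error_loss

# Toy example with random tensors
torch.manual_seed(0)
logits = torch.randn(2, 8, 32)
targets = torch.randint(0, 32, (2, 8))
masked = torch.rand(2, 8) < 0.5
visible_error = (~masked) & (torch.rand(2, 8) < 0.2)
loss = mixture_loss(logits, targets, masked, visible_error)
print(loss.item())
```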
## Usage
### Installation
```bash
pip install torch transformers
```
### Code Generation
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
model_id = "Shuibai12138/CDLM-0.5B"
device = "cuda" if torch.cuda.is_available() else "cpu"
# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to(device)
# Generate code
prompt = "def fibonacci(n):"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
# Use diffusion generation
outputs = model.diffusion_generate(
    inputs=input_ids,
    max_new_tokens=100,
    steps=16,
    temperature=0.8,
)
prompt_len = input_ids.shape[1]
generated_text = tokenizer.decode(outputs.sequences[0][prompt_len:], skip_special_tokens=True)
print("Generated Code:")
print(generated_text)
```
**Note**: This model uses a custom `diffusion_generate` method, so `trust_remote_code=True` is required when loading the model.
### Iterative Refinement
The model supports iterative refinement for code correction. See the [CDLM repository](https://github.com/zhangshuibai/CDLM) for examples of using the model for code correction tasks.
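One building block of such a refinement loop is re-masking the tokens the model is least confident about, then running another denoising pass over them. The sketch below shows only that re-masking step, under stated assumptions: `mask_id`, the function name, and the "probability of the emitted token" confidence heuristic are illustrative, not the CDLM implementation (see the repository for the real code).

```python
import torch

def remask_lowest_confidence(token_ids, logits, mask_id, k):
    """Re-mask the k least-confident tokens for another denoising pass.

    Confidence here = the model's probability of the token it emitted.
    mask_id and this heuristic are illustrative assumptions.
    """
    probs = torch.softmax(logits, dim=-1)
    # Probability assigned to each emitted token, shape (seq,)
    conf = probs.gather(-1, token_ids.unsqueeze(-1)).squeeze(-1)
    low = conf.topk(k, largest=False).indices
    refined = token_ids.clone()
    refined[low] = mask_id
    return refined

# Toy example: re-mask the 3 least-confident positions
torch.manual_seed(0)
seq, vocab, mask_id = 10, 50, 0
token_ids = torch.randint(1, vocab, (seq,))  # no token equals mask_id
logits = torch.randn(seq, vocab)
remasked = remask_lowest_confidence(token_ids, logits, mask_id, k=3)
print((remasked == mask_id).sum().item())  # 3
```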
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{zhang2025correctivediffusionlanguagemodels,
  title={Corrective Diffusion Language Models},
  author={Shuibai Zhang and Fred Zhangzhi Peng and Yiheng Zhang and Jin Pan and Grigorios G. Chrysos},
  year={2025},
  eprint={2512.15596},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2512.15596},
}
```
## Related Resources
- **Paper**: [Corrective Diffusion Language Models](https://arxiv.org/pdf/2512.15596)
- **Code Repository**: [zhangshuibai/CDLM](https://github.com/zhangshuibai/CDLM)
- **Collection**: [HuggingFace Collection](https://huggingface.co/collections/Shuibai12138/cdlm)
- **Base Model**: [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B)
## License
This model is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
## Contact
For questions and issues, please contact:
**Shuibai Zhang** <shuibai@cs.wisc.edu>