[DRAFT] fix: transformers 5.x compat (cache_position + kwargs naming)
Summary
This PR fixes two issues that prevent dots.ocr from working with transformers>=5.0:
1. cache_position TypeError on generation
In transformers 5.x, cache_position is no longer maintained in the generation loop, so it can reach the model as None. The current code evaluates cache_position[0] == 0, which then raises TypeError: 'NoneType' object is not subscriptable.
Fix: Use a combined check that works with both transformers 4.x and 5.x, falling back to past_key_values is None when cache_position is unavailable.
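A minimal sketch of the combined check described above. The helper name is_prefill_step is hypothetical and is not taken from the modeling code. It assumes the first generation step is the one where either cache_position starts at 0 or no cache has been created yet.

```python
def is_prefill_step(cache_position=None, past_key_values=None):
    """Return True on the first generation step (hypothetical helper).

    Works with anything indexable for cache_position (a list here,
    a tensor in real use) and any object or None for past_key_values.
    """
    if cache_position is not None:
        # transformers 4.x style: the generation loop supplies cache_position,
        # and position 0 marks the step that processes the full prompt.
        return int(cache_position[0]) == 0
    # cache_position unavailable (as reported for 5.x):
    # fall back to "no cache has been created yet".
    return past_key_values is None


if __name__ == "__main__":
    print(is_prefill_step(cache_position=[0, 1, 2]))          # 4.x, first step
    print(is_prefill_step(cache_position=[3]))                # 4.x, later step
    print(is_prefill_step(past_key_values=None))              # fallback, first step
    print(is_prefill_step(past_key_values={"layer0": "kv"}))  # fallback, later step
```

The cache_position branch comes first so that behavior under transformers 4.x stays unchanged. The fallback only takes effect when cache_position is missing.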
2. _validate_model_kwargs ValueError for processor outputs
forward() uses **loss_kwargs instead of **kwargs. Transformers 5.x validation only recognizes **kwargs/**model_kwargs as catch-all params, causing processor outputs like mm_token_type_ids to fail validation.
Fix: Rename **loss_kwargs to **kwargs (functionally identical).
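To illustrate why the name of the catch-all parameter matters, here is a simplified stand-in for the validation, written with inspect. It is not the transformers implementation. The function names (accepts_extra_kwargs, unexpected_keys, forward_before, forward_after) are hypothetical, and the set of recognized names is taken from the description above.

```python
import inspect

# Catch-all names the validation is described as recognizing (assumption
# based on the PR text, not read from the transformers source).
CATCH_ALL_NAMES = {"kwargs", "model_kwargs"}


def accepts_extra_kwargs(fn):
    """True if fn has a ** parameter whose name is a recognized catch-all."""
    return any(
        p.kind is inspect.Parameter.VAR_KEYWORD and p.name in CATCH_ALL_NAMES
        for p in inspect.signature(fn).parameters.values()
    )


def unexpected_keys(fn, model_kwargs):
    """Return the keys a name-based validator would reject for fn."""
    if accepts_extra_kwargs(fn):
        return []
    named = set(inspect.signature(fn).parameters)
    return [key for key in model_kwargs if key not in named]


def forward_before(input_ids=None, **loss_kwargs):  # signature before the fix
    return None


def forward_after(input_ids=None, **kwargs):  # signature after the fix
    return None


if __name__ == "__main__":
    processor_outputs = {"input_ids": [1, 2], "mm_token_type_ids": [0, 1]}
    print(unexpected_keys(forward_before, processor_outputs))
    print(unexpected_keys(forward_after, processor_outputs))
```

At call time Python treats both signatures the same way, which is why the rename is functionally identical. Only a validator that inspects parameter names can tell them apart.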
Backward compatibility
Both fixes maintain full backward compatibility with transformers 4.x.
Same issues as dots.ocr PR/50
This PR has the same two issues reported on the dots.ocr PR:
- DotsVLProcessor.__init__ missing video_processor: causes TypeError with transformers 4.57+ (note: dots.mocr may not trigger this if its config doesn't declare video tokens, but the code pattern is the same)
- Predictions differ under transformers 5.x: the model produces garbage output (a single full-page bbox) instead of multi-element layout JSON
Current workaround: Running dots.mocr evaluation with transformers 4.57.6.
Marking as draft until resolved. See dots.ocr PR/50 for detailed description.