Key Modifications

  1. processing_opencua.py
    Implemented the OpenCUAProcessor class with an interface fully aligned with QwenVLProcessor (the Qwen-VL processor), ensuring consistent usage patterns across similar multimodal models.
    Followed the same method signatures, input/output formats, and core logic as the Qwen-VL processor for seamless integration.
  2. config.json
    Updated the auto-processor mapping to point to OpenCUAProcessor, allowing the AutoProcessor class to correctly load the OpenCUA processor without manual specification.
  3. modeling_opencua.py
    • FSDP Sharding Support: Set the base_model_prefix and _no_split_modules attributes to enable proper FSDP (Fully Sharded Data Parallel) sharding during training; the absence of these attributes previously caused FSDP sharding failures.
    • Property Additions: Added model, lm_head, and _supports_sdpa properties to support logits calculation in downstream frameworks (e.g., evaluation pipelines, loss computation).
    • Forward Function Enhancement:
      • Adapted the forward method to comply with the latest transformers library interface standards.
      • Added handling logic for cases where attention_mask is None to prevent runtime errors.
      • Integrated placeholder_mask logic to support multimodal input masking requirements.
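The config.json change in item 2 follows the Hugging Face `auto_map` convention for custom code. A fragment along these lines (the exact module path is an assumption based on the filename above) lets AutoProcessor resolve the class without manual specification:

```json
{
  "auto_map": {
    "AutoProcessor": "processing_opencua.OpenCUAProcessor"
  }
}
```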
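To make the processor alignment in item 1 concrete, here is a minimal sketch of the call pattern the PR describes. This is a hypothetical illustration, not the repository's actual code: the constructor arguments and the `text`/`images` keyword names are assumptions based on the Qwen-VL processor convention of wrapping a tokenizer and an image processor behind one `__call__`.

```python
# Hypothetical sketch of the OpenCUAProcessor interface; argument names
# beyond those mentioned in the PR description are assumptions.
class OpenCUAProcessor:
    """Combines a tokenizer and an image processor behind a single
    __call__, mirroring the Qwen-VL processor's usage pattern."""

    def __init__(self, image_processor, tokenizer):
        self.image_processor = image_processor
        self.tokenizer = tokenizer

    def __call__(self, text=None, images=None, **kwargs):
        # Run each sub-processor only when its input is present and
        # merge the resulting feature dictionaries.
        features = {}
        if images is not None:
            features.update(self.image_processor(images, **kwargs))
        if text is not None:
            features.update(self.tokenizer(text, **kwargs))
        return features


# Usage with stand-in sub-processors (real code would pass an image
# processor and tokenizer loaded from the checkpoint):
processor = OpenCUAProcessor(
    image_processor=lambda imgs, **kw: {"pixel_values": imgs},
    tokenizer=lambda txt, **kw: {"input_ids": [1, 2, 3]},
)
batch = processor(text="describe the screen", images=["<image>"])
```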
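The modeling changes in item 3 can be sketched as follows. This is a simplified stand-in, not the actual OpenCUA model: the class name, the `_no_split_modules` entry, and the toy embedding backbone are assumptions chosen only to show where the FSDP attributes, the `attention_mask is None` guard, and the `placeholder_mask` scatter fit.

```python
# Hypothetical sketch; attribute values and module names are assumptions.
import torch
import torch.nn as nn


class OpenCUAForConditionalGeneration(nn.Module):
    # FSDP sharding needs to know the backbone prefix and which decoder
    # blocks must not be split across shards.
    base_model_prefix = "language_model"
    _no_split_modules = ["Qwen2DecoderLayer"]  # assumed block name
    _supports_sdpa = True

    def __init__(self, vocab_size=128, hidden_size=16):
        super().__init__()
        # Toy stand-in for the real language-model backbone.
        self.language_model = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    @property
    def model(self):
        # Downstream frameworks reach for `.model` to get the backbone.
        return self.language_model

    def forward(self, input_ids, attention_mask=None,
                image_embeds=None, placeholder_mask=None):
        # Guard: build an all-ones mask when the caller passes None,
        # preventing runtime errors downstream.
        if attention_mask is None:
            attention_mask = torch.ones_like(input_ids)
        hidden = self.language_model(input_ids)
        # placeholder_mask marks token positions reserved for image
        # features; scatter the multimodal embeddings into those slots.
        if placeholder_mask is not None and image_embeds is not None:
            hidden = hidden.masked_scatter(
                placeholder_mask.unsqueeze(-1), image_embeds)
        return self.lm_head(hidden)
```

Calling `OpenCUAForConditionalGeneration()(torch.tensor([[1, 2, 3]]))` with no attention mask returns logits of shape `(1, 3, vocab_size)` instead of raising.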
xywang626 changed pull request status to merged

@xywang626 It seems you mixed up the 7B, 32B, and 72B config.json files: the 7B and 72B config.json were changed without changing the safetensors.

Thank you for the kind reminder! I mistakenly uploaded the wrong config.json and will update it immediately!

XLang NLP Lab org

Thanks! The replacement files have all been merged.
