Self-alignment of Large Video Language Models with Refined Regularized Preference Optimization
Paper: arXiv 2504.12083
Clone the repository, navigate to the RRPO directory, and set up the conda environment:
git clone https://github.com/pritamqu/RRPO
cd RRPO
conda create -n llava python=3.10 -y
conda activate llava
pip install -r llavavideo.txt
Download the base model and the RRPO LoRA weights from Hugging Face (these clone URLs use SSH, so an SSH key must be registered with hf.co):
# base model
git clone git@hf.co:lmms-lab/LLaVA-Video-7B-Qwen2
# RRPO weights
git clone git@hf.co:pritamqu/LLaVA-Video-7B-Qwen2-RRPO-16f-LORA
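If SSH access is not configured, the same repositories can be fetched over HTTPS with the `huggingface_hub` library. This is a hypothetical helper, not part of the RRPO codebase; it simply wraps `snapshot_download`:

```python
def download_weights(repo_id: str, local_dir: str) -> str:
    """Download a Hugging Face model repository over HTTPS.

    Hypothetical convenience wrapper (not from the RRPO repo) around
    huggingface_hub.snapshot_download; the import is lazy so the function
    can be defined even before `pip install huggingface_hub`.
    """
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)

# Example calls (each downloads several GB, so they are left commented out):
# download_weights("lmms-lab/LLaVA-Video-7B-Qwen2", "./LLaVA-Video-7B-Qwen2")
# download_weights("pritamqu/LLaVA-Video-7B-Qwen2-RRPO-16f-LORA",
#                  "./LLaVA-Video-7B-Qwen2-RRPO-16f-LORA")
```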
conda activate llava
BASE_WEIGHTS="./LLaVA-Video-7B-Qwen2"
WEIGHTS_ROOT="./"
## using lora weights
python inference.py \
--base_model_name "llavavideo_qwen_7b" \
--model-path ${BASE_WEIGHTS} \
--model-path2 "${WEIGHTS_ROOT}/LLaVA-Video-7B-Qwen2-RRPO-16f-LORA" \
--video_path "sample_video.mp4" \
--question "Describe this video." \
--model_max_length 1024
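The RRPO checkpoint passed via `--model-path2` is a LoRA adapter: a pair of small low-rank matrices added on top of each frozen base weight. A minimal NumPy sketch of how such an adapter merges into a base weight (illustrative only; the sizes, rank, and scaling below are made-up toy values, and `inference.py` loads the adapter through its own code path):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 8, 8, 2   # toy dimensions; real layers are far larger
alpha = 4                  # LoRA scaling hyperparameter (assumed value)

W = rng.standard_normal((d_out, d_in))  # frozen base weight
A = rng.standard_normal((r, d_in))      # LoRA down-projection
B = np.zeros((d_out, r))                # LoRA up-projection (zero-initialized)

# Merged weight: W' = W + (alpha / r) * B @ A
W_merged = W + (alpha / r) * B @ A

# With B at its zero initialization, merging changes nothing yet:
assert np.allclose(W_merged, W)
```

After fine-tuning, `B` is nonzero and the merge bakes the adapter into the base weight, so inference needs no extra matrix multiplications per layer.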
Base model: lmms-lab/llava-onevision-qwen2-7b-si