Is this a posing LoRA or only manga characters?

by krigeta - opened Jan 5

Jan 5

Hello, I am also curating a dataset for a posing LoRA, and it looks like your dataset follows the same format: input a (character) + input b (pose character) = output (the character from input a in the pose from input b). I hope I have that right.
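
For anyone assembling a dataset in this "character + pose = output" format, a quick check that every sample has all three images can save a failed run. The following is only a minimal sketch: the folder names `character`, `pose`, and `target`, the shared-file-stem convention, and the function names are all hypothetical and are not taken from this thread or from any training tool.

```python
from pathlib import Path

# Hypothetical layout (an assumption, not a tool requirement):
#   root/character/001.png  -> input a (the character)
#   root/pose/001.png       -> input b (the pose reference)
#   root/target/001.png     -> output (character a in pose b)
ROLES = ("character", "pose", "target")
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def image_stems(folder: Path) -> set[str]:
    """Return the file stems of all images directly inside folder."""
    return {p.stem for p in folder.iterdir() if p.suffix.lower() in IMAGE_EXTS}


def find_triplets(root: Path) -> tuple[list[str], dict[str, set[str]]]:
    """Return (complete sample stems, stems missing per role)."""
    found = {role: image_stems(root / role) for role in ROLES}
    complete = set.intersection(*found.values())
    seen_anywhere = set.union(*found.values())
    missing = {role: seen_anywhere - found[role] for role in ROLES}
    return sorted(complete), missing
```

Calling `find_triplets(Path("dataset"))` returns the sorted list of stems that appear in all three folders, plus a per-role set of stems that appear elsewhere but are missing from that folder.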

If so, could you share the dataset or the training settings you used? I have not been able to get it working fully; I have 54 samples so far.

nappa114514

Owner Jan 8

edited Jan 8

I'm using a dataset in the exact format you described. There are 26 samples in total. However, I'm a bit hesitant to upload it to Hugging Face because it contains NSFW content.

I used musubi-tuner for training.
The command is as follows:

accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 musubi-tuner/src/musubi_tuner/qwen_image_train_network.py `
--dit qwen_image_edit_2511_bf16.safetensors `
--vae diffusion_pytorch_model.safetensors `
--text_encoder qwen_2.5_vl_7b.safetensors `
--dataset_config xxx.toml `
--sdpa --mixed_precision bf16 `
--timestep_sampling shift `
--weighting_scheme none --discrete_flow_shift 2.2 `
--optimizer_type adamw8bit --learning_rate 1e-3 --gradient_checkpointing `
--max_data_loader_n_workers 2 --persistent_data_loader_workers `
--network_module networks.lora_qwen_image `
--network_dim 32 `
--max_train_epochs 40 --save_every_n_epochs 10 --seed 42 `
--output_dir output --output_name yyy `
--model_version edit-2511 --fp8_vl --fp8_base --fp8_scaled --blocks_to_swap 8
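
The trailing backticks in the command above are PowerShell line continuations; in bash the equivalent is a trailing backslash. One way to sidestep the difference is to hold the arguments in a Python list. The sketch below copies every flag and value verbatim from the command above; the function name `build_train_command` and its two parameters are hypothetical, and the file names are the same placeholders used in the post.

```python
import shlex


def build_train_command(dataset_config: str, output_name: str) -> list[str]:
    """Return the training command from the post as an argument list."""
    return [
        # Options before the script path are read by accelerate itself.
        "accelerate", "launch",
        "--num_cpu_threads_per_process", "1",
        "--mixed_precision", "bf16",
        "musubi-tuner/src/musubi_tuner/qwen_image_train_network.py",
        # Everything below is passed to the training script.
        "--dit", "qwen_image_edit_2511_bf16.safetensors",
        "--vae", "diffusion_pytorch_model.safetensors",
        "--text_encoder", "qwen_2.5_vl_7b.safetensors",
        "--dataset_config", dataset_config,
        "--sdpa", "--mixed_precision", "bf16",
        "--timestep_sampling", "shift",
        "--weighting_scheme", "none", "--discrete_flow_shift", "2.2",
        "--optimizer_type", "adamw8bit", "--learning_rate", "1e-3",
        "--gradient_checkpointing",
        "--max_data_loader_n_workers", "2", "--persistent_data_loader_workers",
        "--network_module", "networks.lora_qwen_image",
        "--network_dim", "32",
        "--max_train_epochs", "40", "--save_every_n_epochs", "10",
        "--seed", "42",
        "--output_dir", "output", "--output_name", output_name,
        "--model_version", "edit-2511",
        "--fp8_vl", "--fp8_base", "--fp8_scaled", "--blocks_to_swap", "8",
    ]


cmd = build_train_command("xxx.toml", "yyy")
# Print a single-line, POSIX-shell-quoted version for copy and paste.
print(shlex.join(cmd))
```

To launch training directly instead of printing, the same list can be passed to `subprocess.run(cmd, check=True)`, which needs no shell-specific line continuation at all.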
