Hi, I can run the single-person scenario and it works perfectly, which suggests that my setup and hardware are fine.
But when I switch to the multi-person scenario, it does not work. Only one person is driven; the other person's lips do not match the audio. Both audio tracks play back correctly.
Here is my configuration JSON file:
{
  "prompt": "Inside a cozy, softly lit café with warm ambient bokeh lights in the background, two young women sit together at a round wooden table. Both women are focused on their smartphones. The woman on the left has straight light-brown hair and wears a thick white knitted sweater; she sits and looks down at her phone. The woman on the right has a blonde bob haircut and wears a deep green cardigan over a white top, smiling slightly as she looks at her phone. Two white ceramic coffee cups rest on saucers in front of them, faint steam rising gently. The atmosphere feels warm, relaxed, and intimate, with a shallow depth of field emphasizing the subjects against a softly blurred café interior. Medium shot, natural lighting, realistic cinematic style. 9:16 vertical formatting.",
  "cond_video": "examples/growth/batch_2/version2/banana_resized.png",
  "audio_type": "para",
  "cond_audio": {
    "person1": "examples/growth/batch_2/audio_person_1.wav",
    "person2": "examples/growth/batch_2/audio_person_2.wav"
  },
  "bbox": {
    "person1": [32, 355, 314, 679],
    "person2": [370, 361, 696, 660]
  }
}
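For reference, here is a minimal sketch of a sanity check on the audio and bbox entries above. `check_multi_person_config` is a hypothetical helper written for this post, not part of InfiniteTalk. It assumes each bbox is laid out as [min, min, max, max] over two axes, and it deliberately does not assume which axis comes first, since the post does not establish that. It only checks that the person keys match, that each box is well-formed, and that the boxes do not overlap.

```python
def check_multi_person_config(cfg):
    """Return a list of problems found in a multi-person input config.

    Hypothetical helper: assumes each bbox is [a_min, b_min, a_max, b_max],
    without assuming whether 'a' is the horizontal or the vertical axis.
    """
    problems = []
    audio = cfg.get("cond_audio", {})
    boxes = cfg.get("bbox", {})

    # Every person with an audio track should have a bbox, and vice versa.
    if set(audio) != set(boxes):
        problems.append(
            f"person keys differ: cond_audio={sorted(audio)}, bbox={sorted(boxes)}"
        )

    # Each box needs four numbers with min < max on both axes.
    valid = []
    for name, box in boxes.items():
        if len(box) != 4:
            problems.append(f"{name}: expected 4 values, got {len(box)}")
            continue
        a_min, b_min, a_max, b_max = box
        if not (a_min < a_max and b_min < b_max):
            problems.append(f"{name}: min/max order looks wrong: {box}")
            continue
        valid.append(name)

    # Overlapping boxes would make the two speakers ambiguous.
    valid.sort()
    for i, first in enumerate(valid):
        for second in valid[i + 1:]:
            p, q = boxes[first], boxes[second]
            if p[0] < q[2] and q[0] < p[2] and p[1] < q[3] and q[1] < p[3]:
                problems.append(f"{first} and {second} overlap")
    return problems


# The audio and bbox entries copied from the config in this post.
config = {
    "cond_audio": {
        "person1": "examples/growth/batch_2/audio_person_1.wav",
        "person2": "examples/growth/batch_2/audio_person_2.wav",
    },
    "bbox": {
        "person1": [32, 355, 314, 679],
        "person2": [370, 361, 696, 660],
    },
}

print(check_multi_person_config(config))  # prints []
```

The empty list means the config passes these basic checks, so they do not explain the problem. The check cannot tell whether the coordinate order matches what the script expects or whether the boxes fit inside the resized input image.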
I run it with this command:
python generate_infinitetalk.py --ckpt_dir weights/Wan2.1-I2V-14B-480P --wav2vec_dir 'weights/chinese-wav2vec2-base' --infinitetalk_dir weights/InfiniteTalk/multi/infinitetalk.safetensors --lora_dir weights/FusionX_LoRa/Wan2.1_I2V_14B_FusionX_LoRA.safetensors --lora_scale 1.0 --size infinitetalk-720 --sample_text_guide_scale 1.0 --sample_audio_guide_scale 2.0 --sample_steps 8 --mode streaming --motion_frame 9 --sample_shift 2 --max_frame_num 550 --save_file examples/growth/sora_2girls/crimeradar_sora_2girls_480swtich_lora --input_json examples/multi_growth_video.json
Is there anything wrong here? Thank you.