V12.1 problems
When using V12.1 for image-to-video, the face at the start of the video stays very consistent with the person in the source image, but it quickly shifts to a different face and then stays that way for the rest of the clip. This is a serious problem: it ruins the image-to-video experience and makes the output unusable. The Remix high/low-noise model, by contrast, keeps the face consistent throughout, but that workflow is too complex, and I genuinely want to use a simple model like Rapid-AIO. How can the consistency of facial and body features be solved completely? I also hope Rapid-AIO can improve prompt adherence. For example, with a prompt like "The character turns 360 degrees in place, ending up facing the camera again, with the head and limbs turning in tandem during the rotation," the clause "with the head and limbs turning in tandem during the rotation" has to be included, otherwise the limbs come out twisted and jumbled. Even with that clause, once the character has turned 180 degrees and faces away from the camera, the face abruptly snaps back to a frontal view, which is quite jarring. I hope Rapid-AIO can improve this as well.
Summary:
1. Character appearance consistency in image-to-video generation is unreliable.
2. The model's adherence to, or understanding of, prompts is insufficient.
It is recommended that Rapid-AIO be tested thoroughly before new LoRA models are merged in, since many people use this model and are eagerly awaiting the update.
This Rapid Mega "AIO" is based on VACE + the V12.1 NSFW model and includes a lot of LoRAs that have residual faces in their training data, which can cause slight variances in facial features. However, VACE does allow "start to final frame" generation, which really is a great solution (and the one I generally prefer), since you can give the final frame the face you are looking for. I recommend using my Qwen AIO to generate a final frame with the character in the exact position you want.
As an addition, there is also a 'reference image' input in WanVideo VACE, which may also help.
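To show how these two suggestions fit together, here is a minimal, purely illustrative Python sketch of the start/end-frame approach. The helper functions are hypothetical placeholders, not real Rapid-AIO, VACE, or ComfyUI APIs; in practice each step corresponds to a stage in your workflow (an image model such as the Qwen AIO to produce the end frame, then VACE conditioning in the video sampler, optionally with a reference image).

```python
from typing import List, Optional
from PIL import Image


def generate_end_frame(start_frame: Image.Image, prompt: str) -> Image.Image:
    """Hypothetical placeholder for an image workflow (e.g. the Qwen AIO
    mentioned above) that renders the same character in the desired final
    pose. Here it simply returns a copy of the start frame."""
    return start_frame.copy()


def generate_video_vace(start_frame: Image.Image,
                        end_frame: Image.Image,
                        prompt: str,
                        reference_image: Optional[Image.Image] = None) -> List[Image.Image]:
    """Hypothetical placeholder for VACE-style video generation conditioned on
    a start frame, an end frame, and an optional reference image that anchors
    the character's identity in between. Returns dummy frames here."""
    return [start_frame.copy(), end_frame.copy()]


if __name__ == "__main__":
    start = Image.open("character_start.png")  # the source image for image-to-video
    prompt = ("The character turns 360 degrees in place, ending up facing the camera, "
              "with the head and limbs turning in tandem during the rotation")

    # 1. Pin the identity at the end of the clip with a generated final frame.
    end = generate_end_frame(start, "same character, facing the camera, final pose of the turn")

    # 2. Condition the video on both frames; passing the source image again as a
    #    reference image is meant to keep the face anchored mid-turn.
    frames = generate_video_vace(start, end, prompt, reference_image=start)
    print(f"generated {len(frames)} frames")
```

This is only a sketch of the idea under those assumptions; the actual setup is done by wiring the corresponding nodes in ComfyUI rather than calling Python functions.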