I’m having trouble fine-tuning a LoRA with Diffusion Pipe to generate hyperrealistic videos of objects. I've tried various datasets and varied both the number of training steps and the overall training duration, but I haven't been able to achieve satisfactory results.
For instance, one specific challenge I face is ensuring that text, such as the title on a book cover, appears clear and accurate in the output. So far, the results are inconsistent, and I’m not sure what adjustments I should make to improve the quality.
Do you have any suggestions for settings, best practices, or procedures that could help me fine-tune a LoRA more effectively in this context? Any guidance would be greatly appreciated.
Thank you!