Hello,

Would it be possible to update the trainer initialization code in dpo.py? It is incompatible with trl versions after 1.00 and has become somewhat outdated, so updating it would save everyone a lot of debugging time.
