Unify the `evaluate`, `log`, and `save_model` implementations across all continual RLHF methods

Each method currently logs its train and eval metrics in its own way. As part of unifying everything, it would be nice to have a single `log` function and a single `evaluate` function shared by all classes.

We could potentially split this into two superclasses, one for `dpo`-based models and one for `ppo`-based ones, and import them wherever they are needed.
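
To make the proposal concrete, here is a rough sketch of how the split described above could look. Everything in it is an assumption for discussion, not existing code: the class names (`ContinualLoggingBase`, `DPOLoggingBase`, `PPOLoggingBase`), the `compute_eval_metrics` hook, the metric names, and the batch layout (plain dicts of floats) are all hypothetical. `save_model` is left out here, but it could follow the same pattern of a shared method plus a small per-family hook.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterable, List, Optional

Metrics = Dict[str, float]


class ContinualLoggingBase(ABC):
    """Hypothetical shared base: one `log` and one `evaluate` for every method."""

    metric_prefix = ""  # overridden by the dpo/ppo families

    def __init__(self, sink: Optional[Callable[[Metrics], None]] = None) -> None:
        # `sink` stands in for a real logger backend; by default we just
        # keep records in memory so the sketch has no external dependency.
        self.history: List[Metrics] = []
        self._sink = sink if sink is not None else self.history.append

    def log(self, metrics: Metrics, split: str = "train") -> Metrics:
        # Same key layout for every method: "<split>/<family>/<metric>".
        record = {f"{split}/{self.metric_prefix}{k}": float(v) for k, v in metrics.items()}
        self._sink(record)
        return record

    @abstractmethod
    def compute_eval_metrics(self, batch: dict) -> Metrics:
        """Family-specific metrics for one eval batch."""

    def evaluate(self, batches: Iterable[dict]) -> Metrics:
        # Average per-batch metrics, then route them through the shared `log`.
        totals: Metrics = {}
        count = 0
        for batch in batches:
            for key, value in self.compute_eval_metrics(batch).items():
                totals[key] = totals.get(key, 0.0) + value
            count += 1
        averaged = {k: v / max(count, 1) for k, v in totals.items()}
        return self.log(averaged, split="eval")


class DPOLoggingBase(ContinualLoggingBase):
    metric_prefix = "dpo/"

    def compute_eval_metrics(self, batch: dict) -> Metrics:
        # Assumed batch layout: parallel lists of chosen/rejected rewards.
        pairs = list(zip(batch["chosen_rewards"], batch["rejected_rewards"]))
        return {
            "reward_accuracy": sum(c > r for c, r in pairs) / len(pairs),
            "reward_margin": sum(c - r for c, r in pairs) / len(pairs),
        }


class PPOLoggingBase(ContinualLoggingBase):
    metric_prefix = "ppo/"

    def compute_eval_metrics(self, batch: dict) -> Metrics:
        # Assumed batch layout: a list of scalar rewards per rollout.
        rewards = batch["rewards"]
        return {"mean_reward": sum(rewards) / len(rewards)}
```

Each continual method would then inherit from one of the two family classes and only override `compute_eval_metrics` if it needs extra metrics, so the key naming and the logging path stay identical across methods.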