Actions: huggingface/trl
Showing runs from all workflows
35,014 workflow runs
grpo_trainer.py): Variational Sequence-Level Soft Policy Optimization (VESPO)
Build PR Documentation #14123: Pull request #5199 synchronize by casinca
grpo_trainer.py): Variational Sequence-Level Soft Policy Optimization (VESPO)
Tests #15156: Pull request #5199 synchronize by casinca
model_kwargs are not used by the model: ['mm_token_type_ids']
Hugging Face Issue Labeler #989: Issue #5201 opened by albertvillanova