Improve PRM inference speed by RyanLiu112 · Pull Request #48 · RLHFlow/RLHF-Reward-Modeling

RyanLiu112 · 2024-12-23T07:53:50Z

Hi! This PR improves the inference speed of the PRM (process reward model). Specifically, I replace `+` with `<|reserved_special_token_0|>` so that the indices of all steps can be obtained in advance. This change allows us to compute the rewards of all steps in a single forward pass.
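To illustrate the idea, here is a minimal sketch in plain Python, not the code from this PR. It assumes that a placeholder token marks the end of each step, that the logits at position `i - 1` predict the token at position `i`, and that the step reward is the softmax probability of `+` over the pair (`+`, `-`). The token ids, the toy logits, and the helper names `placeholder_positions` and `step_rewards` are all hypothetical.

```python
import math

def placeholder_positions(token_ids, placeholder_id):
    """Return the index of every placeholder token (one per reasoning step)."""
    return [i for i, t in enumerate(token_ids) if t == placeholder_id]

def step_rewards(token_ids, logits, placeholder_id, plus_id, minus_id):
    """Score every step from the logits of ONE forward pass.

    `logits[i]` is assumed to be the next-token distribution produced at
    position i, so the prediction for a placeholder at position `pos`
    is read from `logits[pos - 1]`.
    """
    rewards = []
    for pos in placeholder_positions(token_ids, placeholder_id):
        if pos == 0:
            continue  # nothing precedes the placeholder, so no prediction exists
        row = logits[pos - 1]
        plus, minus = row[plus_id], row[minus_id]
        peak = max(plus, minus)  # subtract the max for numerical stability
        exp_plus = math.exp(plus - peak)
        exp_minus = math.exp(minus - peak)
        rewards.append(exp_plus / (exp_plus + exp_minus))
    return rewards

# Toy setup (hypothetical ids): vocabulary of 10 tokens, two steps.
PLACEHOLDER_ID, PLUS_ID, MINUS_ID = 9, 1, 2
token_ids = [5, 6, PLACEHOLDER_ID, 7, PLACEHOLDER_ID]

# Stand-in for the model output: one row of logits per input position.
logits = [[0.0] * 10 for _ in token_ids]
logits[1][PLUS_ID] = 2.0  # the position before the first placeholder favors "+"

positions = placeholder_positions(token_ids, PLACEHOLDER_ID)
rewards = step_rewards(token_ids, logits, PLACEHOLDER_ID, PLUS_ID, MINUS_ID)
```

Because a reserved special token does not occur in ordinary solution text, its positions identify the step boundaries unambiguously, whereas a literal `+` can also appear inside the math itself. With a real model the same lookup would typically be done with tensor indexing on the logits of a single batched forward pass rather than a Python loop.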

Commit 1770084: Improve PRM inference speed

CJReinforce mentioned this pull request on Jan 25, 2025, in "Improve PRM inference speed" #53 (Open)

RyanLiu112 wants to merge 1 commit into RLHFlow:main from RyanLiu112:main
