Improve PRM inference speed #48

Open
RyanLiu112 wants to merge 1 commit into RLHFlow:main from RyanLiu112:main

@RyanLiu112

Hi! This PR improves the inference speed of the PRM. Specifically, I replace `+` with `<|reserved_special_token_0|>` so that the indices of all steps can be obtained in advance. This change allows us to infer all step rewards with a single forward pass.
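
To illustrate the idea, here is a minimal sketch in plain Python rather than the code from this PR. It assumes the step reward is the probability of `+` versus `-` at each placeholder, and that the logits at position `t` score the token at position `t + 1` (the usual causal LM convention). The token ids, `step_positions`, and `step_rewards` are hypothetical names chosen for the example; real ids would come from the tokenizer.

```python
import math

# Hypothetical token ids for a toy vocabulary; real ids come from the tokenizer.
PLUS_ID = 0          # stands in for "+"
MINUS_ID = 1         # stands in for "-"
PLACEHOLDER_ID = 2   # stands in for <|reserved_special_token_0|>


def step_positions(input_ids, placeholder_id=PLACEHOLDER_ID):
    """Return the index of every placeholder token, one per reasoning step."""
    return [i for i, tok in enumerate(input_ids) if tok == placeholder_id]


def step_rewards(logits, input_ids):
    """Score every step from the logits of a single forward pass.

    logits[t] is the list of vocabulary scores produced at position t, which
    (assumed causal LM convention) predicts the token at position t + 1.
    """
    rewards = []
    for pos in step_positions(input_ids):
        if pos == 0:
            continue  # no preceding position to read a prediction from
        plus = logits[pos - 1][PLUS_ID]
        minus = logits[pos - 1][MINUS_ID]
        # Two-way softmax over "+" and "-" (assumed reward definition).
        m = max(plus, minus)
        e_plus, e_minus = math.exp(plus - m), math.exp(minus - m)
        rewards.append(e_plus / (e_plus + e_minus))
    return rewards
```

Because every placeholder index is known before the model runs, the whole solution can be encoded once and all step rewards read off the same logits, instead of running one forward pass per step.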
