Bug fixes in reward utils #3

Open
insop wants to merge 1 commit into main from codex/find-and-fix-a-bug-in-codebase
insop (Owner) commented Jun 8, 2025

Summary

  • prevent in-place modification of labels when computing log probabilities
  • correct return type for format_reward_func
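
The PR text does not include the diff, so the following is only a minimal sketch of the two kinds of fix the summary names, not the actual change. It uses plain Python lists in place of tensors. `IGNORE_INDEX`, `masked_label_ids`, and the tag pattern inside `format_reward_func` are hypothetical, and the `float` return type is an assumption, since the summary does not say which type the function was corrected to.

```python
import re
from typing import List

# Assumed sentinel marking label positions that should not contribute to the loss.
IGNORE_INDEX = -100


def masked_label_ids(labels: List[int]) -> List[int]:
    """Return a copy of `labels` with ignored positions replaced by 0.

    The copy is the point: the caller's `labels` must keep its IGNORE_INDEX
    entries so that later masking still works.
    """
    safe = list(labels)  # copy before editing, never write into `labels`
    for i, token_id in enumerate(safe):
        if token_id == IGNORE_INDEX:
            safe[i] = 0
    return safe


def format_reward_func(completion: str) -> float:
    """Hypothetical format check that always returns a float reward."""
    pattern = r"^<think>.*?</think>\s*<answer>.*?</answer>$"
    matched = re.match(pattern, completion.strip(), re.DOTALL) is not None
    return 1.0 if matched else 0.0


labels = [5, IGNORE_INDEX, 7]
safe = masked_label_ids(labels)
print(labels, safe)
print(format_reward_func("<think>a</think>\n<answer>b</answer>"))
```

With PyTorch tensors, the analogous step would be calling `labels.clone()` before any masked assignment, so the caller's tensor is left unchanged.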

Testing

  • python -m py_compile kernel-coder/utils.py kernel-coder/nano_r1_script.py scripts/kernelllm.py

https://chatgpt.com/codex/tasks/task_e_68461478a650832cbca1e1eb3396a49f
