bruno686/Clear-R1

Clear-R1

Unlike other repositories, which focus mainly on curve analysis, this one emphasizes interesting phenomena and process analysis, so that it is easier to understand what is actually happening.

Disclaimer!

The code in this repository is not original: it is based on DeepSeek-RL-Qwen-0.5B-GRPO-gsm8k and deepseek_r1_train. I have only conducted experimental analysis on top of it. Thanks to the authors for their work!

Bitter Lessons

Don't use models that are not Instruct (instruction-tuned) variants! But why?

Qwen-2.5-3B output

Qwen-2.5-3B-GRPO output

About

The Simplest R1
