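Hello everyone! Thank you for your interest in ROLL.
ROLL has recently gained many new features. Below is a summary of the recent updates. We will keep iterating on ROLL, and you are welcome to join the ROLL community.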
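🚀 Highlights:
- New model support: Qwen3-VL, Qwen3-MoE-VL, Qwen3-Omni, GLM-4.7
- Partial GPU overlap between agentic training and rollout: GPUs left idle by training are switched to rollout
- Coroutine-based refactor of DynamicSamplingScheduler
- New: FSDP2 Strategy
- Training supports sequence packing and dynamic batching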
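
To illustrate the general idea behind sequence packing, here is a minimal sketch in plain Python. It is not ROLL's implementation: the function names `pack_sequences` and `cu_seqlens` and the `max_tokens` parameter are hypothetical, and the greedy first-fit-decreasing strategy is only one common way to group variable-length sequences under a token budget so that less compute is spent on padding.

```python
from itertools import accumulate


def pack_sequences(lengths, max_tokens):
    """Group sequence indices into packs of at most max_tokens tokens.

    Hypothetical helper: greedy first-fit-decreasing bin packing.
    """
    packs = []  # each pack is a list of sequence indices
    loads = []  # tokens already placed in each pack
    # Visit the longest sequences first (ties keep their original order).
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for i in order:
        n = lengths[i]
        if n > max_tokens:
            raise ValueError(f"sequence {i} has {n} tokens, budget is {max_tokens}")
        for p, used in enumerate(loads):
            if used + n <= max_tokens:
                packs[p].append(i)
                loads[p] += n
                break
        else:
            # No existing pack has room, so open a new one.
            packs.append([i])
            loads.append(n)
    return packs


def cu_seqlens(lengths, pack):
    """Cumulative sequence boundaries inside one pack, starting at 0."""
    return [0] + list(accumulate(lengths[i] for i in pack))


lengths = [5, 3, 7, 2, 4]
packs = pack_sequences(lengths, max_tokens=8)
boundaries = [cu_seqlens(lengths, pack) for pack in packs]
```

The cumulative boundaries are what a variable-length attention kernel typically needs in order to keep packed sequences from attending to each other.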
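🚀 Main new features:
- Rollout
  - Coroutine-based refactor of DynamicSamplingScheduler
  - Custom rollout pre/post-processing, supporting dynamic sampling params, multi-stage generation, and ThinkingBudget control
  - SGLang: Strategy refactor, server mode support, native onload/offload, in-flight FP8 quantized rollout, multi-node deployment across machines
  - vLLM: DP/EP support; supports vllm==0.12.0
  - AgentNative Rollout paradigm: AgentNativeStepEnvManager + SokobanNativeEnv, with context management handled entirely by the env
  - Async Rollout Hang Detect: adds hang detection for asynchronous rollout to quickly locate the problematic env
  - Rollout dump & mock support, making precision alignment of the forward/train stages more efficient
  - Agentic pipeline supports train-val/rollout overlap
- Training
  - FSDP2
  - Megatron supports LoRA; LoRA RL blog: https://macaron.im/mindlab/research/building-trillion-parameter-reasoning-rl-with-10-gpus
  - Online saving of model parameters in HF format during Megatron training
  - FP8 training support for the Megatron Strategy
  - Sequence packing, with a minor adjustment to the loss_func interface definition
  - Dynamic batching
  - DeepSpeed SFT support
  - Model Update optimizations: eliminated cross-machine redundancy, overlapped weight conversion with NCCL broadcast, optimized host-to-device transfer, and changed multi-PP serial synchronization to lock-mode simultaneous synchronization
- Asynchronous Feature
  - Partial GPU overlap between training and rollout: GPUs left idle by training are switched to rollout; report: https://arxiv.org/abs/2512.24873
  - Agentic off-policy loss and IS (importance sampling) correction
- Pipeline recipe
  - VLM image tool use: DeepEyes, with tool calls overlapped with reward computation
- Models: new model support for Qwen3-VL, Qwen3-MoE-VL, Qwen3-Omni-Thinker, GLM-4.7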
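
For readers unfamiliar with the off-policy loss item above, the following is a minimal sketch of one common form of correction, assuming "IS" stands for importance sampling. It is not ROLL's loss: the function names, the `cap` parameter, and the choice of truncated per-token weights are all assumptions made for illustration.

```python
import math


def truncated_is_weights(train_logprobs, rollout_logprobs, cap=2.0):
    """Per-token importance weights pi_train / pi_rollout, truncated at cap.

    Hypothetical helper: the cap bounds the variance of the weights.
    """
    return [
        min(math.exp(t - r), cap)
        for t, r in zip(train_logprobs, rollout_logprobs)
    ]


def is_corrected_pg_loss(train_logprobs, rollout_logprobs, advantages, cap=2.0):
    """Policy-gradient surrogate loss reweighted by truncated IS weights.

    In an autograd framework the weights would be treated as constants
    (no gradient flows through them); here everything is a plain float.
    """
    weights = truncated_is_weights(train_logprobs, rollout_logprobs, cap)
    terms = [
        -w * a * lp
        for w, a, lp in zip(weights, advantages, train_logprobs)
    ]
    return sum(terms) / len(terms)


# When the training policy matches the rollout policy, every weight is 1.
on_policy = truncated_is_weights([-1.0, -0.5], [-1.0, -0.5])
# A large mismatch is clipped to the cap instead of exploding.
clipped = truncated_is_weights([-1.0], [-3.0], cap=2.0)
loss = is_corrected_pg_loss([-1.0, -0.5], [-1.0, -0.5], [1.0, -1.0])
```

The weights compensate for tokens that were sampled by a stale or numerically different rollout policy, which is the situation asynchronous training creates.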