@fy1214 fy1214 commented Dec 21, 2025

Support INT4 QAT (quantization-aware training) in slime. This PR currently does two things:

  1. Support INT4 fake quantization of MoE experts in Megatron, enabled by an environment flag.
  2. Support quant_param_int4 in slime.
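
For readers unfamiliar with the term, fake quantization rounds a weight to the INT4 grid and immediately dequantizes it, so the forward pass sees the quantization error while the tensor stays in floating point. The sketch below is a minimal illustration in plain Python, not the code in this PR: the function name `fake_quant_int4`, the symmetric per-group scheme, and the `group_size` default are all assumptions made for the example. In actual QAT the backward pass usually treats the rounding as identity (a straight-through estimator), which this sketch does not model.

```python
def fake_quant_int4(weights, group_size=32):
    """Quantize floats to symmetric INT4 per group, then dequantize.

    Hypothetical helper for illustration only; the grouping scheme and
    group_size are assumptions, not taken from the PR.
    """
    qmin, qmax = -8, 7  # signed 4-bit integer range
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to qmax.
        # An all-zero group gets scale 1.0 to avoid dividing by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        for w in group:
            q = max(qmin, min(qmax, round(w / scale)))  # round, then clamp
            out.append(q * scale)  # dequantize back to float
    return out
```

Each group can take at most 16 distinct values after this step, and the per-element error is bounded by half the group's scale.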

1. qat patch in megatron
2. int4 quant weight for sglang
1. add sglang new patch file
2. add new arg int4-params-rollout
3. add new request post_process_weights in both update_weight_from_distributed.py and update_weight_from_tensor.py
1. fix sgl-int4.patch bug
1. add SGLangEngine post_process_weights api
1. add convert_hf_to_hf_int4.py script to create int4 weights.
2. add README.md explaining how to use int4
3. add int4 training script
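
As background for the int4 weight conversion, INT4 values are commonly stored two per byte. The sketch below shows one such packing in plain Python. It is not taken from convert_hf_to_hf_int4.py: the function names `pack_int4` and `unpack_int4` and the nibble order (low nibble first) are assumptions for illustration, and the real script and the SGLang kernels may use a different layout.

```python
def pack_int4(values):
    """Pack signed INT4 values (-8..7) two per byte, low nibble first.

    Hypothetical layout for illustration; not the PR's actual format.
    """
    if len(values) % 2:
        raise ValueError("need an even number of values")
    packed = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        for v in (lo, hi):
            if not -8 <= v <= 7:
                raise ValueError(f"{v} is outside the INT4 range")
        # & 0xF keeps the two's-complement low 4 bits of each value.
        packed.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(packed)


def unpack_int4(packed):
    """Inverse of pack_int4: recover the signed INT4 values."""
    out = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            # Nibbles 8..15 encode the negative values -8..-1.
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out
```

Packing halves the storage relative to INT8, at the cost of an unpack step (or a kernel that reads nibbles directly) at inference time.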