feat: add expert_wise_scale support for per-expert FP8 quantization in MoE models #35
Open
lifelongeeek wants to merge 2 commits into aws-neuron:main from