Conversation

fish-jiang commented Feb 4, 2026

Summary
In this PR we:

  1. enabled coopmat on Xe1
  2. tuned warp settings on Xe2/Xe3
  3. optimized the coopmat1 shader code to reduce unnecessary memory loads and tensor core computation for MoE models

fish-jiang requested a review from 0cc4m as a code owner February 4, 2026 06:09

h9j6k commented Feb 4, 2026

Bringing older Xe devices in line is great. Does the 'Xe1' you are referring to include Intel DG1, the very first generation of discrete Intel Xe? Or is this Xe1 coopmat enablement only for DG2 onwards, such as Alchemist A770/A380 etc. (gfx 12.5+)? Thanks.

github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Feb 4, 2026
fish-jiang marked this pull request as draft February 4, 2026 08:37