vulkan: optimized coopmat matmul perf for IntelGPU #19320

fish-jiang · 2026-02-04T06:09:04Z

Summary
In this PR we:

enabled coopmat on Xe1
tuned warp settings on Xe2/Xe3
optimized the coopmat1 shader code to reduce unnecessary memory loads and tensor core computation for MoE models

h9j6k · 2026-02-04T06:57:14Z

Bringing older Xe devices in line is great. Does the 'Xe1' you are referring to include Intel DG1, the very first generation of discrete Intel Xe? Or is this Xe1 coopmat enablement for DG2 onwards, like Alchemist A770/A380 etc. (gfx 12.5+)? Thanks.

fish-jiang requested a review from 0cc4m as a code owner February 4, 2026 06:09

vulkan: optimized coopmat matmul perf for IntelGPU

215a103

fish-jiang force-pushed the Xetuning branch from 65c3740 to 215a103 February 4, 2026 06:21

github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Feb 4, 2026

fish-jiang marked this pull request as draft February 4, 2026 08:37
