[llvm] [AMDGPU] Fix GFX11 WMMA intrinsic lowering regression for compute kernels (PR #164036)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 20 01:08:56 PDT 2025
jayfoad wrote:
> Add explicit high-priority (AddedComplexity=10000) patterns that match bare intrinsic calls directly without requiring VOP3PMods wrappers. These patterns provide default zero modifiers to the instruction format and override the broken patterns.
I'm not a big fan of this approach. AddedComplexity should generally only be used as a cost model to prefer one pattern over another; it should not be used to fix correctness issues. "Broken" patterns should instead be disabled using predicates.
So maybe the ROCm-style fix is preferable? But I have not looked at it closely, and to be honest I do not understand the root cause of the problem you are fixing.
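For illustration only, the difference between the two mechanisms could be sketched in TableGen roughly as follows. Every name here (`int_example_op`, `EXAMPLE_INST`, `EXAMPLE_INST_MODS`, `hasWorkingModifierPatterns()`) is hypothetical and assumed to be defined elsewhere; none of it is taken from the PR or from the AMDGPU backend, and the backend's own predicate conventions may differ from the generic `Predicates` field used here.

```tablegen
// Hypothetical sketch; the intrinsic, instructions and subtarget query
// below are illustrative assumptions, not code from the PR.
def HasWorkingModifierPatterns
    : Predicate<"Subtarget->hasWorkingModifierPatterns()">;

// AddedComplexity approach: the broken pattern stays enabled, and the new
// pattern is selected only because its complexity bonus outranks it.
let AddedComplexity = 10000 in
def : Pat<(i32 (int_example_op i32:$src)),
          (EXAMPLE_INST $src)>;

// Predicate approach: the problematic pattern is only available when the
// predicate holds, so it cannot be selected where it is known to be wrong.
let Predicates = [HasWorkingModifierPatterns] in
def : Pat<(i32 (int_example_op i32:$src)),
          (EXAMPLE_INST_MODS $src)>;
```

In the first form correctness depends on the ordering of pattern complexities; in the second, the incorrect pattern is simply never a candidate.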
https://github.com/llvm/llvm-project/pull/164036