[llvm] [AMDGPU] Mark WMMA machine instructions as convergent (PR #165602)
Syadus Sefat via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 31 16:05:22 PDT 2025
================
@@ -1906,8 +1906,10 @@ defm V_WMMA_SCALE_F32_32X16X128_F4_w32 : WMMAInstGFX12<"v_wmma_scale_f32_32x16
defm V_WMMA_SCALE16_F32_32X16X128_F4_w32 : WMMAInstGFX12<"v_wmma_scale16_f32_32x16x128_f4", F32_32X16X128_F4_SCALE16_w32, "_w32">;
} // End is_wmma_xdl = 1.
-defm V_WMMA_LD_SCALE_PAIRED_B32 : VOP3PInst<"v_wmma_ld_scale_paired_b32", VOP_WMMA_LD_SCALE<i32, VCSrc_b32_Lo256>>;
-defm V_WMMA_LD_SCALE16_PAIRED_B64 : VOP3PInst<"v_wmma_ld_scale16_paired_b64", VOP_WMMA_LD_SCALE<i64, VCSrc_b64_Lo256>>;
+let isConvergent = 1 in {
----------------
mssefat wrote:
I think we should avoid setting isConvergent = 1 directly in VOP_WMMA_LD_SCALE: that class extends a profile class, whereas isConvergent is a property of the instruction itself.
Instead, we can create a multiclass, VOP3PInst_WMMA_LD, that wraps VOP3PInst specifically for the WMMA_LD_SCALE_PAIRED instructions:
```tablegen
multiclass VOP3PInst_WMMA_LD<string OpName, VOPProfile P, SDPatternOperator node = null_frag> {
  let isConvergent = 1 in {
    defm NAME : VOP3PInst<OpName, P, node>;
  }
}
```
Then we can use it for WMMA LD_SCALE_PAIRED instructions:
```tablegen
defm V_WMMA_LD_SCALE_PAIRED_B32 : VOP3PInst_WMMA_LD<"v_wmma_ld_scale_paired_b32", VOP_WMMA_LD_SCALE<i32, VCSrc_b32_Lo256>>;
defm V_WMMA_LD_SCALE16_PAIRED_B64 : VOP3PInst_WMMA_LD<"v_wmma_ld_scale16_paired_b64", VOP_WMMA_LD_SCALE<i64, VCSrc_b64_Lo256>>;
```
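To illustrate the distinction between a profile and an instruction property, here is a minimal standalone sketch. Everything prefixed with Toy is a hypothetical stand-in written for this example, not one of the real AMDGPU classes, and the file deliberately does not include Target.td. It only aims to show how a `let` inside a wrapper multiclass can set an instruction-level field while the profile stays untouched:
```tablegen
// Toy stand-ins only; none of these are the real AMDGPU definitions.

// A "profile" describes operand types and carries no instruction flags.
class ToyProfile<string ty> {
  string Type = ty;
}

// An "instruction" record; isConvergent lives here, not on the profile.
class ToyInst<string name, ToyProfile p> {
  string AsmName = name;
  ToyProfile Profile = p;
  bit isConvergent = 0;
}

// Plays the role of the generic instruction multiclass.
multiclass ToyVOP3PInst<string name, ToyProfile p> {
  def NAME : ToyInst<name, p>;
}

// Wrapper that forces isConvergent = 1 on whatever it instantiates.
multiclass ToyVOP3PInst_Convergent<string name, ToyProfile p> {
  let isConvergent = 1 in {
    defm NAME : ToyVOP3PInst<name, p>;
  }
}

def ToyProfile_b32 : ToyProfile<"i32">;

// Same profile, different instruction property.
defm TOY_LD_SCALE_B32 : ToyVOP3PInst_Convergent<"toy_ld_scale_b32", ToyProfile_b32>;
defm TOY_PLAIN_B32    : ToyVOP3PInst<"toy_plain_b32", ToyProfile_b32>;
```
Dumping the records for this file with llvm-tblgen should show the flag set on TOY_LD_SCALE_B32 and left at its default on TOY_PLAIN_B32, even though both share ToyProfile_b32. That is the separation the wrapper multiclass is meant to preserve.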
Please let me know whether this looks better to you, or whether you would prefer that I stick with the existing approach.
https://github.com/llvm/llvm-project/pull/165602