[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
Jay Foad via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Mon Jul 22 06:03:36 PDT 2024
================
@@ -34,18 +34,17 @@ entry:
}
define amdgpu_kernel void @test_llvm_amdgcn_fdot2_bf16_bf16_dpp(
-; SDAG-GFX11-LABEL: test_llvm_amdgcn_fdot2_bf16_bf16_dpp:
-; SDAG-GFX11: ; %bb.0: ; %entry
-; SDAG-GFX11-NEXT: s_load_b128 s[0:3], s[0:1], 0x24
-; SDAG-GFX11-NEXT: s_waitcnt lgkmcnt(0)
-; SDAG-GFX11-NEXT: scratch_load_b32 v0, off, s2
-; SDAG-GFX11-NEXT: scratch_load_u16 v1, off, s3
-; SDAG-GFX11-NEXT: scratch_load_b32 v2, off, s1
-; SDAG-GFX11-NEXT: s_waitcnt vmcnt(0)
-; SDAG-GFX11-NEXT: v_dot2_bf16_bf16_e64_dpp v0, v2, v0, v1 quad_perm:[1,0,0,0] row_mask:0xf bank_mask:0xf bound_ctrl:1
-; SDAG-GFX11-NEXT: scratch_store_b16 off, v0, s0
-; SDAG-GFX11-NEXT: s_endpgm
-;
+; GFX11-LABEL: test_llvm_amdgcn_fdot2_bf16_bf16_dpp:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_load_b128 s[0:3], s[0:1], 0x24
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: scratch_load_b32 v0, off, s2
+; GFX11-NEXT: scratch_load_u16 v1, off, s3
+; GFX11-NEXT: scratch_load_b32 v2, off, s1
+; GFX11-NEXT: s_waitcnt vmcnt(0)
+; GFX11-NEXT: v_dot2_bf16_bf16_e64_dpp v0, v2, v0, v1 quad_perm:[1,0,0,0] row_mask:0xf bank_mask:0xf bound_ctrl:1
+; GFX11-NEXT: scratch_store_b16 off, v0, s0
+; GFX11-NEXT: s_endpgm
; GISEL-GFX11-LABEL: test_llvm_amdgcn_fdot2_bf16_bf16_dpp:
----------------
jayfoad wrote:
Should probably remove these GISEL-GFX11 checks since the corresponding RUN line is disabled.
https://github.com/llvm/llvm-project/pull/96163
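
As a rough illustration of the point above, a small script can flag FileCheck prefixes that still have check lines in a test but are only named on a disabled RUN line. This is a minimal sketch, not part of the PR or of any LLVM tooling: the function name, the `DISABLED-RUN:` spelling, and the sample test text are hypothetical, and the regular expressions assume `;`-commented IR tests with prefixes passed via `-check-prefix`/`--check-prefixes`.

```python
import re

# An active lit RUN line in a ';'-commented test file.
RUN_RE = re.compile(r"^;\s*RUN:(.*)$")
# -check-prefix=X, --check-prefix=X or --check-prefixes=X,Y anywhere in a line.
PREFIX_RE = re.compile(r"--?check-prefix(?:es)?[= ]([\w,-]+)")
# A check directive such as "; GISEL-GFX11-NEXT: ..."; group 1 is the prefix.
CHECK_RE = re.compile(
    r"^;\s*([A-Za-z0-9_-]+?)(?:-(?:LABEL|NEXT|SAME|NOT|DAG|EMPTY|COUNT-\d+))?:"
)


def declared_prefixes(line):
    """Return the set of prefixes named by check-prefix options on one line."""
    found = set()
    for group in PREFIX_RE.findall(line):
        found.update(p for p in group.split(",") if p)
    return found


def stale_check_prefixes(text):
    """Prefixes that have check lines but appear on no active RUN line.

    Only prefixes that are declared somewhere (for example on a disabled
    RUN line) are considered, so ordinary comments are not reported.
    """
    active, declared, used = set(), set(), set()
    for line in text.splitlines():
        declared |= declared_prefixes(line)
        if RUN_RE.match(line):
            active |= declared_prefixes(line)
            continue
        m = CHECK_RE.match(line)
        if m:
            used.add(m.group(1))
    return sorted((declared - active) & used)


# Hypothetical test contents, loosely modelled on the hunk quoted above.
SAMPLE = """\
; RUN: llc -mtriple=amdgcn < %s | FileCheck %s --check-prefixes=GFX11
; DISABLED-RUN: llc -global-isel < %s | FileCheck %s --check-prefixes=GFX11,GISEL-GFX11
; GFX11-LABEL: f:
; GFX11-NEXT: s_endpgm
; GISEL-GFX11-LABEL: f:
; GISEL-GFX11-NEXT: s_endpgm
"""

if __name__ == "__main__":
    print(stale_check_prefixes(SAMPLE))
```

On the sample text this reports `GISEL-GFX11`, the prefix whose checks are never exercised because its RUN line is not active.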