[llvm] [AMDGPU] Add tests for vector rebroadcast. (PR #91322)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Mon May 13 05:10:20 PDT 2024


================
@@ -0,0 +1,1871 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s | FileCheck -check-prefix=GFX9 %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 < %s | FileCheck -check-prefix=GFX10 %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 < %s | FileCheck -check-prefix=GFX11 %s
+
+define <2 x i8> @shuffle_v2i8_rebroadcast(ptr addrspace(1) %arg0) {
+; GFX9-LABEL: shuffle_v2i8_rebroadcast:
+; GFX9:       ; %bb.0: ; %entry
+; GFX9-NEXT:  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:  global_load_ushort v0, v[0:1], off
+; GFX9-NEXT:  s_waitcnt vmcnt(0)
+; GFX9-NEXT:  v_lshrrev_b16_e32 v0, 8, v0
+; GFX9-NEXT:  v_mov_b32_e32 v1, v0
+; GFX9-NEXT:  s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: shuffle_v2i8_rebroadcast:
+; GFX10:       ; %bb.0: ; %entry
+; GFX10-NEXT:  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT:  global_load_ushort v0, v[0:1], off
+; GFX10-NEXT:  s_waitcnt vmcnt(0)
+; GFX10-NEXT:  v_lshrrev_b16 v0, 8, v0
+; GFX10-NEXT:  v_mov_b32_e32 v1, v0
+; GFX10-NEXT:  s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: shuffle_v2i8_rebroadcast:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:  global_load_u16 v0, v[0:1], off
+; GFX11-NEXT:  s_waitcnt vmcnt(0)
+; GFX11-NEXT:  v_lshrrev_b16 v0, 8, v0
+; GFX11-NEXT:  s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT:  v_mov_b32_e32 v1, v0
+; GFX11-NEXT:  s_setpc_b64 s[30:31]
+entry:
+  %val0 = load <2 x i8>, ptr addrspace(1) %arg0
+  %val1 = shufflevector <2 x i8> %val0, <2 x i8> undef, <2 x i32> <i32 1, i32 1>
----------------
arsenm wrote:

Use poison in place of undef throughout.
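
For illustration, a minimal sketch of how the quoted function might look with that change applied. The `ret` line and the closing brace are assumptions, since the quoted hunk ends at the `shufflevector`; the only intended difference is the unused second shuffle operand. The CHECK lines may need to be regenerated if the output changes.

```llvm
define <2 x i8> @shuffle_v2i8_rebroadcast(ptr addrspace(1) %arg0) {
entry:
  %val0 = load <2 x i8>, ptr addrspace(1) %arg0
  ; The second operand is never selected by the mask <1, 1>,
  ; so it is written as poison rather than undef.
  %val1 = shufflevector <2 x i8> %val0, <2 x i8> poison, <2 x i32> <i32 1, i32 1>
  ; Assumed: the original test body is cut off in the quoted hunk.
  ret <2 x i8> %val1
}
```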

https://github.com/llvm/llvm-project/pull/91322