[llvm] [AMDGPU] Select v_lshl_add_u32 instead of v_mul_lo_u32 by constant (PR #71035)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 2 01:28:26 PDT 2023


================
@@ -1194,7 +1206,7 @@ define amdgpu_kernel void @sub_i32_constant(ptr addrspace(1) %out, ptr addrspace
 ; GFX10W64-NEXT:    s_load_dwordx2 s[0:1], s[0:1], 0x24
 ; GFX10W64-NEXT:    s_waitcnt vmcnt(0)
 ; GFX10W64-NEXT:    v_readfirstlane_b32 s2, v1
-; GFX10W64-NEXT:    v_mul_u32_u24_e32 v0, 5, v0
+; GFX10W64-NEXT:    v_lshl_add_u32 v0, v0, 2, v0
----------------
arsenm wrote:

This is worse; the 24-bit multiply instructions are fast and have a smaller encoding.
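
For readers comparing the two lines in the hunk, here is a minimal Python sketch of what each instruction computes. It assumes that `v_mul_u32_u24` multiplies the low 24 bits of each operand and keeps the low 32 bits of the product, and that `v_lshl_add_u32` computes `(src0 << src1) + src2` with 32-bit wraparound. The function names mirror the mnemonics for readability only; `agree_on` is a hypothetical helper and nothing here is code from the patch. The sketch checks value equivalence only and says nothing about the speed or encoding size raised in the comment.

```python
MASK32 = 0xFFFFFFFF
MASK24 = 0x00FFFFFF


def v_mul_u32_u24(a: int, b: int) -> int:
    """Model (assumed semantics): multiply the low 24 bits of each operand,
    keep the low 32 bits of the product."""
    return ((a & MASK24) * (b & MASK24)) & MASK32


def v_lshl_add_u32(src: int, shift: int, addend: int) -> int:
    """Model (assumed semantics): (src << shift) + addend with 32-bit wraparound."""
    shifted = (src << (shift & 31)) & MASK32
    return (shifted + addend) & MASK32


def agree_on(x: int) -> bool:
    """Hypothetical helper: do both forms of 'x * 5' produce the same value?"""
    return v_mul_u32_u24(5, x) == v_lshl_add_u32(x, 2, x)


if __name__ == "__main__":
    # Inputs that fit in 24 bits give the same result either way.
    print(all(agree_on(x) for x in (0, 1, 7, 0xFFFFFF)))
    # An input with bit 24 set does not, because the 24-bit multiply drops it.
    print(agree_on(0x1000000))
```

Under these assumptions the two forms agree only when the input is known to fit in 24 bits, so the rewrite in the hunk does not change the computed value there; the objection is about cost.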

https://github.com/llvm/llvm-project/pull/71035

