[llvm] [AMDGPU] Select v_lshl_add_u32 instead of v_mul_lo_u32 by constant (PR #71035)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 3 04:24:46 PDT 2023


================
@@ -83,8 +83,7 @@ define amdgpu_kernel void @add_i32_constant(ptr addrspace(1) %out, ptr addrspace
 ; GFX9-NEXT:  ; %bb.1:
 ; GFX9-NEXT:    s_load_dwordx4 s[8:11], s[0:1], 0x34
 ; GFX9-NEXT:    s_bcnt1_i32_b64 s4, s[4:5]
-; GFX9-NEXT:    s_mul_i32 s4, s4, 5
-; GFX9-NEXT:    v_mov_b32_e32 v1, s4
+; GFX9-NEXT:    v_lshl_add_u32 v1, s4, 2, s4
----------------
jayfoad wrote:

`s_mul_i32` is fast (unlike `v_mul_lo_u32`), so replacing it with `v_lshl_add_u32` is not really an improvement. (Actually it looks OK in this case because the result needs to be copied to a VGPR anyway, but in general we do not know that when we select instructions.)
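
For reference, the rewrite in the hunk above relies on `x * 5 == (x << 2) + x` in wrapping 32-bit arithmetic. A minimal C++ sketch of that identity follows; `lshl_add_u32`, `mul5` and `check_mul5_identity` are hypothetical helper names that only model the arithmetic, and the masking of the shift amount to its low 5 bits is an assumption about the instruction's behaviour, not something taken from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of v_lshl_add_u32: (a << shift) + b, wrapping modulo 2^32.
// Assumption: only the low 5 bits of the shift amount are used.
constexpr std::uint32_t lshl_add_u32(std::uint32_t a, std::uint32_t shift,
                                     std::uint32_t b) {
    return (a << (shift & 31u)) + b;
}

// Low 32 bits of a multiply by 5, i.e. the value the original s_mul_i32 produced.
constexpr std::uint32_t mul5(std::uint32_t x) { return x * 5u; }

// Both forms agree, including when the result wraps around.
static_assert(lshl_add_u32(7u, 2u, 7u) == mul5(7u), "small value");
static_assert(lshl_add_u32(0xFFFFFFFFu, 2u, 0xFFFFFFFFu) == mul5(0xFFFFFFFFu),
              "wrapping value");

// Runtime check of the same identity for an arbitrary input.
inline void check_mul5_identity(std::uint32_t x) {
    assert(lshl_add_u32(x, 2u, x) == mul5(x));
}
```

This only shows that the two forms compute the same value; it says nothing about which one is cheaper on the scalar or vector unit, which is the actual concern here.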

https://github.com/llvm/llvm-project/pull/71035

