[llvm] [AMDGPU] Select v_lshl_add_u32 instead of v_mul_lo_u32 by constant (PR #71035)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 3 04:24:46 PDT 2023
================
@@ -83,8 +83,7 @@ define amdgpu_kernel void @add_i32_constant(ptr addrspace(1) %out, ptr addrspace
; GFX9-NEXT: ; %bb.1:
; GFX9-NEXT: s_load_dwordx4 s[8:11], s[0:1], 0x34
; GFX9-NEXT: s_bcnt1_i32_b64 s4, s[4:5]
-; GFX9-NEXT: s_mul_i32 s4, s4, 5
-; GFX9-NEXT: v_mov_b32_e32 v1, s4
+; GFX9-NEXT: v_lshl_add_u32 v1, s4, 2, s4
----------------
jayfoad wrote:
`s_mul_i32` is fast (unlike `v_mul_lo_u32`), so in general this change is not an improvement. (It actually looks OK in this case, because the result needs to be copied to a VGPR anyway, but in general we do not know that at the point where we select instructions.)
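
For reference, the rewrite in the hunk above relies on the identity `x * 5 == (x << 2) + x` modulo 2^32. A minimal Python sketch of that arithmetic follows; it is only a model, not anything from the patch. The helper names `lshl_add_u32`, `mul_lo_u32` and `mul5_via_shift_add` are made up for illustration, and the model assumes the shift amount is in the range 0..31.

```python
MASK32 = 0xFFFFFFFF  # keep results to 32 bits, like a 32-bit register


def lshl_add_u32(src0: int, shift: int, src2: int) -> int:
    """Illustrative model of a shift-left-then-add: (src0 << shift) + src2.

    Assumption: 0 <= shift <= 31; other shift amounts are not modeled.
    """
    return (((src0 & MASK32) << shift) + (src2 & MASK32)) & MASK32


def mul_lo_u32(a: int, b: int) -> int:
    """Low 32 bits of an unsigned 32-bit multiply."""
    return ((a & MASK32) * (b & MASK32)) & MASK32


def mul5_via_shift_add(x: int) -> int:
    """x * 5 computed as (x << 2) + x, mirroring the new GFX9 check line."""
    return lshl_add_u32(x, 2, x)


if __name__ == "__main__":
    # Spot-check the identity, including values that wrap around 2**32.
    for x in (0, 1, 7, 0x40000000, 0xFFFFFFFF):
        assert mul5_via_shift_add(x) == mul_lo_u32(x, 5)
```

The two forms agree on the value, so the question raised above is purely about cost: which instruction is cheaper depends on whether the operands end up in SGPRs or VGPRs.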
https://github.com/llvm/llvm-project/pull/71035