[PATCH] D150587: [KnownBits] Make shl/lshr/ashr implementations optimal

Mon May 15 12:43:44 PDT 2023

nikic added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/amdgpu.private-memory.ll:224
 ; SI-PROMOTE-VECT: s_lshr_b32 [[SREG:s[0-9]+]], 0x10000, [[SCALED_IDX]]
-; SI-PROMOTE-VECT: s_and_b32 s{{[0-9]+}}, [[SREG]], 0xffff
+; SI-PROMOTE-VECT: s_and_b32 s{{[0-9]+}}, [[SREG]], 1
 define amdgpu_kernel void @short_array(ptr addrspace(1) %out, i32 %index) #0 {
----------------
goldstein.w.n wrote:
> nikic wrote:
> > I believe this is correct, because SCALED_IDX is `IDX << 4` and as such at least 16. Consequently, `0x10000 >> SCALED_IDX` is either zero or one, and the `and` mask can be narrowed to 1.
> Was the old value buggy then?
No, it's also correct; the constant is just unnecessarily wide. I believe the narrowing is done as part of demanded bits simplification.
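
To illustrate the argument above, here is a minimal sketch that checks by brute force that the old mask (0xffff) and the new mask (1) select the same value. The helper names (`shifted_constant`, `masks_agree`, `check_all`) are hypothetical and not part of the patch. It assumes a 32-bit shift and models shift amounts of 32 or more as producing 0, which is a simplification (in LLVM IR such a shift yields poison). It also covers IDX = 0, where the shift amount is 0 and the shifted value is 0x10000; both masks still yield 0 in that case.

```python
WIDTH = 32          # assumed register width (s_lshr_b32)
CONSTANT = 0x10000  # the constant being shifted in the test


def shifted_constant(scaled_idx: int) -> int:
    """Model `0x10000 >> SCALED_IDX` on 32 bits (hypothetical helper)."""
    # Assumption: out-of-range shifts are modeled as 0; in LLVM IR the
    # result of such a shift would be poison instead.
    if scaled_idx >= WIDTH:
        return 0
    return (CONSTANT >> scaled_idx) & 0xFFFFFFFF


def masks_agree(idx: int) -> bool:
    """Check that `and 0xffff` and `and 1` agree for SCALED_IDX = IDX << 4."""
    scaled_idx = idx << 4
    sreg = shifted_constant(scaled_idx)
    return (sreg & 0xFFFF) == (sreg & 1)


def check_all(max_idx: int = 64) -> bool:
    """Brute-force the claim over a range of index values."""
    return all(masks_agree(idx) for idx in range(max_idx))


if __name__ == "__main__":
    print(check_all())
```

The equivalence relies on the shift amount being a multiple of 16: for an arbitrary shift such as 8, the shifted value is 0x100 and the two masks disagree.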

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150587/new/

https://reviews.llvm.org/D150587