[PATCH] D118461: [AMDGPU] Introduce new ISel combine for trunc-slr patterns

Thu Feb 3 03:45:51 PST 2022

foad added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2290
+// v_and_b32_e64 $a, (1 << $b), $a
+// v_cmp_eq_u32_e64 $a, (1 << $b), $a 
+
----------------
tsymalla wrote:
> foad wrote:
> > foad wrote:
> > > `v_cmp_ne_u32_e64 $a, 0, $a` is probably better because 0 is always an inline constant, but `1 << $b` might not be.
> > Also, if you do this, there will only be one use of the constant not two, so I don't think you will have to "Restrict the range to prevent using an additional VGPR for the shifted value".
> The resulting value should still be checked to ensure no 32-bit overflow occurs, correct? For instance, if the shift value is something like 33, 1 << 33 would exceed Int32_Max.
I'm not sure there is any need to check. Shifting a 32-bit value by 33 gives an undefined result (a shift amount greater than or equal to the bit width yields poison in LLVM IR), so it doesn't really matter what code we generate in that case.
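
For reference, here is a small sketch (not part of the patch) of why comparing against 0 is equivalent to comparing against `1 << $b` when the shift amount is in range. The helper names are hypothetical. The inline-constant range of -16..64 for integers is my assumption about the target and should be checked against the ISA documentation.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit register values


def bit_set_via_eq(a: int, b: int) -> bool:
    """Form from the patch comment: (a & (1 << b)) == (1 << b)."""
    bit = (1 << b) & MASK32
    return (a & bit) == bit


def bit_set_via_ne_zero(a: int, b: int) -> bool:
    """Suggested form: (a & (1 << b)) != 0, which uses the constant only once."""
    bit = (1 << b) & MASK32
    return (a & bit) != 0


def is_inline_int(value: int) -> bool:
    """Assumption: integer inline constants cover -16..64 inclusive."""
    return -16 <= value <= 64


def forms_agree(samples) -> bool:
    """Check both forms for every in-range shift amount (0..31)."""
    return all(
        bit_set_via_eq(a, b) == bit_set_via_ne_zero(a, b)
        for a in samples
        for b in range(32)
    )


SAMPLES = [0, 1, 2, 0x80000000, 0xFFFFFFFF, 0x12345678, 0xDEADBEEF]
```

Since `1 << b` has exactly one bit set for in-range `b`, the masked value is either that bit or 0, so the two comparisons always agree. Shift amounts of 32 or more are deliberately not modelled, because the result is undefined there anyway.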

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118461/new/

https://reviews.llvm.org/D118461