[PATCH] D94203: GlobalISel: Add combine for G_UREM by power of 2

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 7 06:45:00 PST 2021


foad added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll:210
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; CHECK-NEXT:    s_movk_i32 s4, 0x1000
-; CHECK-NEXT:    v_mov_b32_e32 v1, 0xfffff000
-; CHECK-NEXT:    v_cvt_f32_u32_e32 v2, s4
-; CHECK-NEXT:    v_rcp_iflag_f32_e32 v2, v2
-; CHECK-NEXT:    v_mul_f32_e32 v2, 0x4f7ffffe, v2
-; CHECK-NEXT:    v_cvt_u32_f32_e32 v2, v2
-; CHECK-NEXT:    v_mul_lo_u32 v1, v1, v2
-; CHECK-NEXT:    v_mul_hi_u32 v1, v2, v1
-; CHECK-NEXT:    v_add_i32_e32 v1, vcc, v2, v1
-; CHECK-NEXT:    v_mul_hi_u32 v1, v0, v1
-; CHECK-NEXT:    v_lshlrev_b32_e32 v1, 12, v1
-; CHECK-NEXT:    v_sub_i32_e32 v0, vcc, v0, v1
-; CHECK-NEXT:    v_subrev_i32_e32 v1, vcc, s4, v0
-; CHECK-NEXT:    v_cmp_le_u32_e32 vcc, s4, v0
-; CHECK-NEXT:    v_cndmask_b32_e32 v0, v0, v1, vcc
-; CHECK-NEXT:    v_subrev_i32_e32 v1, vcc, s4, v0
-; CHECK-NEXT:    v_cmp_le_u32_e32 vcc, s4, v0
-; CHECK-NEXT:    v_cndmask_b32_e32 v0, v0, v1, vcc
+; CHECK-NEXT:    s_add_i32 s4, 0x1000, -1
+; CHECK-NEXT:    v_and_b32_e32 v0, s4, v0
----------------
arsenm wrote:
> foad wrote:
> > Maybe this is the same question as above, but: why doesn't this get constant folded to an s_mov?
> We're missing constant folding combines for most operations. The only constant folding done is the implicit folding done by the MIRBuilder when the original inputs are constants.
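Right, but you use the builder to build this add: `auto Add = Builder.buildAdd(Ty, Pow2Src1, NegOne);`
So why wouldn't it get constant folded at that point? Both inputs are constants here (0x1000 and -1), so I would expect the builder's implicit folding to produce 0xfff directly instead of leaving the s_add_i32 in the output.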
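
For reference, here is a minimal standalone sketch of the arithmetic the combine relies on and of the fold I would expect for this test. It is plain C++, not LLVM code; `isPow2` and `uremPow2` are hypothetical names made up for illustration, and the mask is deliberately written as `Pow2 + (-1)` to mirror the `buildAdd(Ty, Pow2Src1, NegOne)` call quoted above.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): true if V is a nonzero power of two.
constexpr bool isPow2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// For a power-of-two divisor, x urem 2^k == x & (2^k - 1).
// The mask is formed as Pow2 + (-1), the same shape as the add in the patch.
constexpr uint32_t uremPow2(uint32_t X, uint32_t Pow2) {
  return X & (Pow2 + static_cast<uint32_t>(-1));
}

// The divisor in the test is 0x1000, so the mask is a compile-time constant.
static_assert(isPow2(0x1000u), "0x1000 is a power of two");
static_assert(0x1000u + static_cast<uint32_t>(-1) == 0xfffu,
              "the add of two constants folds to 0xfff");

// The masked result agrees with the real unsigned remainder.
static_assert(uremPow2(0x12345u, 0x1000u) == 0x12345u % 0x1000u,
              "mask matches urem");
static_assert(uremPow2(0xffffffffu, 0x1000u) == 0xfffu,
              "mask matches urem at the top of the range");
```

If the add were folded when it is built, I would expect the output for this test to need only the mask constant and the v_and_b32, with no s_add_i32.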


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94203/new/

https://reviews.llvm.org/D94203


