[PATCH] D94203: GlobalISel: Add combine for G_UREM by power of 2

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 7 13:36:25 PST 2021


arsenm added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll:210
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; CHECK-NEXT:    s_movk_i32 s4, 0x1000
-; CHECK-NEXT:    v_mov_b32_e32 v1, 0xfffff000
-; CHECK-NEXT:    v_cvt_f32_u32_e32 v2, s4
-; CHECK-NEXT:    v_rcp_iflag_f32_e32 v2, v2
-; CHECK-NEXT:    v_mul_f32_e32 v2, 0x4f7ffffe, v2
-; CHECK-NEXT:    v_cvt_u32_f32_e32 v2, v2
-; CHECK-NEXT:    v_mul_lo_u32 v1, v1, v2
-; CHECK-NEXT:    v_mul_hi_u32 v1, v2, v1
-; CHECK-NEXT:    v_add_i32_e32 v1, vcc, v2, v1
-; CHECK-NEXT:    v_mul_hi_u32 v1, v0, v1
-; CHECK-NEXT:    v_lshlrev_b32_e32 v1, 12, v1
-; CHECK-NEXT:    v_sub_i32_e32 v0, vcc, v0, v1
-; CHECK-NEXT:    v_subrev_i32_e32 v1, vcc, s4, v0
-; CHECK-NEXT:    v_cmp_le_u32_e32 vcc, s4, v0
-; CHECK-NEXT:    v_cndmask_b32_e32 v0, v0, v1, vcc
-; CHECK-NEXT:    v_subrev_i32_e32 v1, vcc, s4, v0
-; CHECK-NEXT:    v_cmp_le_u32_e32 vcc, s4, v0
-; CHECK-NEXT:    v_cndmask_b32_e32 v0, v0, v1, vcc
+; CHECK-NEXT:    s_add_i32 s4, 0x1000, -1
+; CHECK-NEXT:    v_and_b32_e32 v0, s4, v0
----------------
foad wrote:
> arsenm wrote:
> > foad wrote:
> > > Maybe this is the same question as above, but: why doesn't this get constant folded to an s_mov?
> > We're missing constant folding combines for most operations. The only constant folding done is the implicit folding performed by the MIRBuilder when the original inputs are constants.
> Right, but you use the builder to build this add: `auto Add = Builder.buildAdd(Ty, Pow2Src1, NegOne);`
> So why wouldn't it get constant folded at that point?
It looks like this isn't actually using the constant folding builder.
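
For reference, the rewrite under discussion relies on the identity x urem 2^k == x & (2^k - 1), with the mask formed as an add of -1, as in the new CHECK lines. Below is a minimal standalone C++ sketch of that arithmetic only. The helper names `isPow2`, `pow2Mask`, and `uremByPow2` are hypothetical and are not part of the patch or of LLVM, and nothing here models the MIRBuilder.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not part of the patch or of LLVM.

// True if V has exactly one bit set.
bool isPow2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Mask formed as Pow2 + (-1), mirroring the add in the new CHECK lines.
// Unsigned addition wraps, so adding 0xFFFFFFFF subtracts one.
uint32_t pow2Mask(uint32_t Pow2) {
  assert(isPow2(Pow2) && "divisor must be a power of two");
  return Pow2 + 0xFFFFFFFFu;
}

// x urem Pow2 rewritten as x & (Pow2 - 1).
uint32_t uremByPow2(uint32_t X, uint32_t Pow2) { return X & pow2Mask(Pow2); }
```

With the divisor 0x1000 from this test, `pow2Mask` evaluates to 0xfff, so a constant-folded version of the sequence would need only that single constant as the mask operand.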


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94203/new/

https://reviews.llvm.org/D94203


