[PATCH] D94203: GlobalISel: Add combine for G_UREM by power of 2
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 7 13:36:25 PST 2021
arsenm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll:210
; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; CHECK-NEXT: s_movk_i32 s4, 0x1000
-; CHECK-NEXT: v_mov_b32_e32 v1, 0xfffff000
-; CHECK-NEXT: v_cvt_f32_u32_e32 v2, s4
-; CHECK-NEXT: v_rcp_iflag_f32_e32 v2, v2
-; CHECK-NEXT: v_mul_f32_e32 v2, 0x4f7ffffe, v2
-; CHECK-NEXT: v_cvt_u32_f32_e32 v2, v2
-; CHECK-NEXT: v_mul_lo_u32 v1, v1, v2
-; CHECK-NEXT: v_mul_hi_u32 v1, v2, v1
-; CHECK-NEXT: v_add_i32_e32 v1, vcc, v2, v1
-; CHECK-NEXT: v_mul_hi_u32 v1, v0, v1
-; CHECK-NEXT: v_lshlrev_b32_e32 v1, 12, v1
-; CHECK-NEXT: v_sub_i32_e32 v0, vcc, v0, v1
-; CHECK-NEXT: v_subrev_i32_e32 v1, vcc, s4, v0
-; CHECK-NEXT: v_cmp_le_u32_e32 vcc, s4, v0
-; CHECK-NEXT: v_cndmask_b32_e32 v0, v0, v1, vcc
-; CHECK-NEXT: v_subrev_i32_e32 v1, vcc, s4, v0
-; CHECK-NEXT: v_cmp_le_u32_e32 vcc, s4, v0
-; CHECK-NEXT: v_cndmask_b32_e32 v0, v0, v1, vcc
+; CHECK-NEXT: s_add_i32 s4, 0x1000, -1
+; CHECK-NEXT: v_and_b32_e32 v0, s4, v0
----------------
foad wrote:
> arsenm wrote:
> > foad wrote:
> > > Maybe this is the same question as above, but why doesn't this get constant folded to an s_mov?
> > We're missing constant folding combines for most operations. The only constant folding done is the implicit folding the MIRBuilder performs when the original inputs are constants.
> Right, but you use the builder to build this add: `auto Add = Builder.buildAdd(Ty, Pow2Src1, NegOne);`
> So why wouldn't it get constant folded at that point?
It looks like this isn't actually using the constant folding builder.
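
For reference, here is a minimal standalone sketch of the arithmetic under discussion. It is plain C++ and does not use the GlobalISel APIs. The helper names `isPowerOf2` and `uremPow2` are hypothetical and exist only for illustration. It shows the identity the combine relies on (`x urem pow2 == x & (pow2 + -1)`) and the value the `s_add_i32 s4, 0x1000, -1` in the test output would become if the add were folded.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): true if V is a nonzero power of two.
constexpr bool isPowerOf2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Unsigned remainder by a power of two, written in the same add-then-and
// shape as the new test output: mask = Pow2 + (-1), result = X & mask.
constexpr uint32_t uremPow2(uint32_t X, uint32_t Pow2) {
  return X & (Pow2 + UINT32_C(0xffffffff)); // adding -1 wraps to Pow2 - 1
}

// With both operands constant, the add folds to a single immediate,
// which is what a constant-folding builder would be expected to emit.
constexpr uint32_t FoldedMask = UINT32_C(0x1000) + UINT32_C(0xffffffff);
static_assert(FoldedMask == 0xfff, "0x1000 + -1 folds to 0xfff");

// The masked form agrees with the ordinary remainder for this divisor.
static_assert(isPowerOf2(0x1000), "divisor is a power of two");
static_assert(uremPow2(0x12345, 0x1000) == 0x12345 % 0x1000, "identity holds");
```

The checks are `static_assert`s, so the block is verified at compile time. The identity only holds when the divisor is a nonzero power of two, which is why the sketch checks that separately.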
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94203/new/
https://reviews.llvm.org/D94203