[PATCH] D135316: [RISCV] Use branchless form for selects with -1 in either arm
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 6 13:44:21 PDT 2022
reames added inline comments.
================
Comment at: llvm/test/CodeGen/RISCV/uadd_sat_plus.ll:19
+; RV32I-NEXT: sltu a0, a1, a0
+; RV32I-NEXT: seqz a0, a0
+; RV32I-NEXT: addi a0, a0, -1
----------------
reames wrote:
> craig.topper wrote:
> > reames wrote:
> > > reames wrote:
> > > > craig.topper wrote:
> > > > > Is the seqz+addi equivalent to neg since a0 is [0,1]?
> > > > It is. Not sure what's causing this to be formed for RV32. We get the not form for RV64.
> > > >
> > > > (I literally just worked through the two cases by hand if you want to double check my reasoning.)
> > > Sorry, typo. We got the *neg* form for RV64.
> > Weird ordering quirk.
> >
> > On RV32, we expand a select_cc that has a uaddo condition. This gives a select_cc with a compare with. The uaddo is expanded later and introduces a setcc. Now we have a select_cc with a setcc as the condition LHS. Your select_cc optimization runs before we get a chance to fold the setcc into the select_cc. This creates a seteq with a setcc input. A combine on the neg creates the addi. We don't have any combine to fix the two setccs together.
> >
> > On RV64, the uaddo is expanded during type legalization instead of LegalizeDAG so we form a different select_cc in LegalizeDAG.
> >
> > What happens if you move your new combine below the call to `combine_CC`?
> It isn't clear to me why moving the code below another transform in what is presumably a fixed point loop would cause any difference. I tried it, and confirmed there was no change in this test. Unless maybe I misunderstood you here?
Er, ignore this for the moment. I was on the wrong branch and dealing with the wrong patch entirely.
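
For anyone who wants to double check the seqz+addi vs. neg reasoning from earlier in the thread, here is a minimal sketch that models the instructions on a 32-bit register in Python. The helper names (`sltu`, `seqz`, `addi`, `neg`, `seqz_addi_form`, `uadd_sat_sketch`) are made up for illustration and are not LLVM APIs. The final `or` of the mask into the sum is an assumption about how the select with -1 in one arm gets consumed; that instruction is not part of the test lines quoted above.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit register (RV32)


def sltu(a, b):
    """Set-less-than-unsigned: 1 if a < b (unsigned), else 0."""
    return 1 if (a & MASK32) < (b & MASK32) else 0


def seqz(x):
    """Set-if-equal-zero: 1 if x == 0, else 0."""
    return 1 if (x & MASK32) == 0 else 0


def addi(x, imm):
    """Add immediate, wrapping to 32 bits."""
    return (x + imm) & MASK32


def neg(x):
    """Two's complement negate, wrapping to 32 bits."""
    return (-x) & MASK32


def seqz_addi_form(x):
    """The sequence from the RV32 output: seqz followed by addi -1."""
    return addi(seqz(x), -1)


# The two forms agree when the input is a boolean (0 or 1),
# which is all that sltu can produce.
for x in (0, 1):
    assert seqz_addi_form(x) == neg(x)

# They do NOT agree for arbitrary inputs, so the fold is only
# valid when the operand is known to be in [0, 1].
assert seqz_addi_form(2) != neg(2)


def uadd_sat_sketch(a, b):
    """Hypothetical branchless unsigned saturating add.

    Assumes the mask is or'ed into the wrapped sum: the mask is all
    ones on overflow (giving the saturated value) and zero otherwise.
    """
    s = (a + b) & MASK32
    overflow = sltu(s, a)   # wrapped sum is smaller than an input
    mask = neg(overflow)    # 0 or 0xFFFFFFFF
    return s | mask
```

Working the two cases by hand gives the same result: input 0 becomes 1 after seqz and 0 after addi -1, and input 1 becomes 0 after seqz and -1 after addi -1, which matches neg in both cases.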
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D135316/new/
https://reviews.llvm.org/D135316