[PATCH] D62266: [DAGCombine][X86][AArch64][ARM] (C - x) + y -> (y - x) + C fold

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 22 14:24:02 PDT 2019


lebedev.ri added a comment.

In D62266#1512613 <https://reviews.llvm.org/D62266#1512613>, @efriedma wrote:

> > So the DAG is simpler, but that particular case is worse for the backend.
>
> It's "simpler", in some sense, but it's more complicated to lower because it's using the carry output of a uaddo in a non-addcarry operation, which is legal, but more expensive than using in an addcarry, particularly on ARM.  I think the missing combine is that we should be able to turn the subtraction into some form of addcarry/subcarry.


Hmm, that does not seem to be all.
I've added

  bool HasSUBCARRY = TLI.isOperationLegalOrCustom(ISD::SUBCARRY, VT);
  bool HasADDCARRY = TLI.isOperationLegalOrCustom(ISD::ADDCARRY, VT);
  if (HasSUBCARRY || HasADDCARRY) {
    unsigned ANYCARRY = HasSUBCARRY ? ISD::SUBCARRY : ISD::ADDCARRY;
    // (sub X, Carry) -> (subcarry/addcarry X, 0, Carry)
    if (SDValue Carry = getAsCarry(TLI, N1)) {
      return DAG.getNode(ANYCARRY, DL, DAG.getVTList(VT, Carry.getValueType()),
                         N0, DAG.getConstant(0, DL, VT), Carry);
    }
  
    if (HasSUBCARRY) {
      // (sub Carry, X) -> (subcarry 0, X, Carry)
      if (SDValue Carry = getAsCarry(TLI, N0)) {
        return DAG.getNode(ISD::SUBCARRY, DL,
                           DAG.getVTList(VT, Carry.getValueType()),
                           DAG.getConstant(0, DL, VT), N1, Carry);
      }
    }
  }

at the end of `DAGCombiner::visitSUB()`, and get this

  SelectionDAG has 17 nodes:
    t0: ch = EntryToken
            t6: i32,ch = CopyFromReg t0, Register:i32 %2
              t4: i32,ch = CopyFromReg t0, Register:i32 %1
              t2: i32,ch = CopyFromReg t0, Register:i32 %0
            t30: i32,i32 = uaddo t4, t2
          t31: i32,i32 = subcarry Constant:i32<0>, t6, t30:1
        t39: i32 = and t31, Constant:i32<65535>
      t40: ch = br_cc t0, seteq:ch, t39, Constant:i32<65535>, BasicBlock:ch<if.end 0x55706fb58828>
    t19: ch = br t40, BasicBlock:ch<for.cond.preheader 0x55706fb58698>

which is then resulting in

  SelectionDAG has 22 nodes:
    t0: ch = EntryToken
              t6: i32,ch = CopyFromReg t0, Register:i32 %2
                      t4: i32,ch = CopyFromReg t0, Register:i32 %1
                      t2: i32,ch = CopyFromReg t0, Register:i32 %0
                    t51: i32,i32 = ARMISD::ADDC t4, t2
                  t52: i32,i32 = ARMISD::ADDE Constant:i32<0>, Constant:i32<0>, t51:1
                t45: i32 = sub Constant:i32<1>, t52
              t46: i32,i32 = ARMISD::SUBC t45, Constant:i32<1>
            t47: i32,i32 = ARMISD::SUBE Constant:i32<0>, t6, t46:1
          t39: i32 = and t47, Constant:i32<65535>
        t41: glue = ARMISD::CMPZ t39, Constant:i32<65535>
      t43: ch = ARMISD::BRCOND t0, BasicBlock:ch<if.end 0x55706fb58828>, Constant:i32<0>, Register:i32 $cpsr, t41
    t19: ch = br t43, BasicBlock:ch<for.cond.preheader 0x55706fb58698>

So some other ARM-specific combine is missing, too?
Or am i totally misreading this?


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62266/new/

https://reviews.llvm.org/D62266





More information about the llvm-commits mailing list