[llvm] r242436 - AArch64: Implement conditional compare sequence matching.

Arnaud A. de Grandmaison arnaud.degrandmaison at arm.com
Fri Jul 17 08:39:08 PDT 2015


Hi Matthias,

Thanks for working on this! It brings some nice improvements. However, we
found an 11% regression in one of our performance benchmarks on Cortex-A57.
I've attached a reproducer.

> llc -o - -mcpu=cortex-a57 reg.ll // Without your patch
	...
	csel	w8, w9, w8, lt
	adrp	x9, d
	cmp	 w8, #100 
	str	w8, [x9, :lo12:d]
	csel	w8, w11, w8, lt
	cmp	 w8, w11, lsl #2
	csel	w8, w11, w8, gt
	str	w8, [x9, :lo12:d]
	adrp	x9, acc
	ldr	x11, [x9, :lo12:acc]
	...

> llc -o - -mcpu=cortex-a57 reg.ll // With your patch
	...
	csel	w8, w9, w8, lt
	cmp	 w8, #100
	csel	w12, w11, w8, lt
	lsl	w13, w11, #2
	adrp	x9, d
	str	w8, [x9, :lo12:d]
	ccmp	w12, w13, #0, ge
	csel	w8, w11, w8, gt
	cmp	 w12, w13
	str	w8, [x9, :lo12:d]
	adrp	x9, acc
	csel	w8, w11, w12, gt
	ldr	x11, [x9, :lo12:acc]
	...

Before your patch, we had 3 x csel + 2 x cmp
With your patch applied, we have 4 x csel + 2 x cmp + 1 x ccmp + 1 x lsl
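
For anyone comparing the two listings, here is a minimal standalone model of
what a ccmp does to the flags, based on the description in the quoted patch
("They set the NZCV flags to a predefined value if their predicate is false").
The names Flags, cmp32, ccmp32, holds and selectAnd are hypothetical helpers
for illustration only, and only a handful of condition codes are modelled.
The selectAnd function follows the CHECK lines of the select_and test in the
patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the NZCV flags register.
struct Flags { bool N, Z, C, V; };

// Flags produced by "cmp a, b" (computes a - b) on 32-bit registers.
static Flags cmp32(uint32_t a, uint32_t b) {
  uint32_t r = a - b;
  Flags f;
  f.N = (r >> 31) != 0;
  f.Z = (r == 0);
  f.C = (a >= b);                          // set when there is no borrow
  f.V = (((a ^ b) & (a ^ r)) >> 31) != 0;  // signed overflow of a - b
  return f;
}

// Only a subset of the AArch64 condition codes is modelled here.
enum class Cond { EQ, NE, GE, LT, GT };

static bool holds(Cond c, Flags f) {
  switch (c) {
  case Cond::EQ: return f.Z;
  case Cond::NE: return !f.Z;
  case Cond::GE: return f.N == f.V;
  case Cond::LT: return f.N != f.V;
  case Cond::GT: return !f.Z && f.N == f.V;
  }
  return false;
}

// "ccmp a, b, #nzcv, cond": compare if cond holds on the incoming flags,
// otherwise load the flags from the 4-bit immediate (N is bit 3, V is bit 0).
static Flags ccmp32(Flags in, uint32_t a, uint32_t b, unsigned nzcv, Cond c) {
  if (holds(c, in))
    return cmp32(a, b);
  return Flags{(nzcv & 8) != 0, (nzcv & 4) != 0, (nzcv & 2) != 0,
               (nzcv & 1) != 0};
}

// cmp w1, #5 ; ccmp w0, w1, #0, ne ; csel ..., lt
// Models (w0 < w1) && (w1 != 5) from the select_and test.
static bool selectAnd(int32_t w0, int32_t w1) {
  Flags f = cmp32(static_cast<uint32_t>(w1), 5);
  f = ccmp32(f, static_cast<uint32_t>(w0), static_cast<uint32_t>(w1), 0,
             Cond::NE);
  return holds(Cond::LT, f);
}
```

When w1 == 5 the ccmp loads #0 into NZCV, so N == V and the final "lt" test
fails, which is how the sequence encodes the "and" without a second csel.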

Could you have a look into this regression ?

Thanks,
Arnaud
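
P.S. As a reading aid for the quoted patch, this is a rough standalone sketch
of the rule it describes for which and/or trees can become a cmp;ccmp;...
chain: an "or" needs at least one side through which a negate can be pushed
down to the leaves. Node, Op, leaf, bin, canPushNegate and isEmittable are
hypothetical names, not LLVM APIs, and the sketch leaves out the hasOneUse()
and depth checks that the real code performs.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the SETCC/AND/OR nodes the patch inspects.
enum class Op { SetCC, And, Or };

struct Node {
  Op Kind;
  std::unique_ptr<Node> L, R;
};

static std::unique_ptr<Node> leaf() {
  return std::unique_ptr<Node>(new Node{Op::SetCC, nullptr, nullptr});
}

static std::unique_ptr<Node> bin(Op K, std::unique_ptr<Node> L,
                                 std::unique_ptr<Node> R) {
  return std::unique_ptr<Node>(new Node{K, std::move(L), std::move(R)});
}

// A negate can be pushed through leaves and through ORs whose operands allow
// it; it cannot be pushed through an AND (that would produce an OR).
static bool canPushNegate(const Node &N) {
  if (N.Kind == Op::SetCC)
    return true;
  return N.Kind == Op::Or && canPushNegate(*N.L) && canPushNegate(*N.R);
}

// Simplified legality check: both subtrees must be convertible, and every OR
// needs at least one operand that accepts a pushed negate.
static bool isEmittable(const Node &N) {
  if (N.Kind == Op::SetCC)
    return true;
  if (!isEmittable(*N.L) || !isEmittable(*N.R))
    return false;
  if (N.Kind == Op::Or)
    return canPushNegate(*N.L) || canPushNegate(*N.R);
  return true;
}
```

Under this sketch "or (or A B) (and C D)" is accepted, while the patch's
counterexample "or (and A B) (and C D)" is rejected, matching the
select_ororand and select_noccmp tests.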

> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-
> bounces at cs.uiuc.edu] On Behalf Of Matthias Braun
> Sent: 16 July 2015 22:03
> To: llvm-commits at cs.uiuc.edu
> Subject: [llvm] r242436 - AArch64: Implement conditional compare sequence
> matching.
> 
> Author: matze
> Date: Thu Jul 16 15:02:37 2015
> New Revision: 242436
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=242436&view=rev
> Log:
> AArch64: Implement conditional compare sequence matching.
> 
> This is a new iteration of the reverted r238793 /
> http://reviews.llvm.org/D8232 which wrongly assumed that any and/or trees
> can be represented by conditional compare sequences, however there are
> some restrictions to that. This version fixes this and adds comments that
> explain exactly what types of and/or trees can actually be implemented as
> conditional compare sequences.
> 
> Related to http://llvm.org/PR20927, rdar://18326194
> 
> Differential Revision: http://reviews.llvm.org/D10579
> 
> Modified:
>     llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
>     llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h
>     llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td
>     llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td
>     llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll
> 
> Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-
> project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=24243
> 6&r1=242435&r2=242436&view=diff
> ==========================================================
> ====================
> --- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Thu Jul 16
> +++ 15:02:37 2015
> @@ -76,6 +76,9 @@ cl::opt<bool> EnableAArch64ELFLocalDynam
>      cl::desc("Allow AArch64 Local Dynamic TLS code generation"),
>      cl::init(false));
> 
> +/// Value type used for condition codes.
> +static const MVT MVT_CC = MVT::i32;
> +
>  AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
>                                               const AArch64Subtarget &STI)
>      : TargetLowering(TM), Subtarget(&STI) {
> @@ -809,6 +812,9 @@ const char *AArch64TargetLowering::getTa
>    case AArch64ISD::ADCS:              return "AArch64ISD::ADCS";
>    case AArch64ISD::SBCS:              return "AArch64ISD::SBCS";
>    case AArch64ISD::ANDS:              return "AArch64ISD::ANDS";
> +  case AArch64ISD::CCMP:              return "AArch64ISD::CCMP";
> +  case AArch64ISD::CCMN:              return "AArch64ISD::CCMN";
> +  case AArch64ISD::FCCMP:             return "AArch64ISD::FCCMP";
>    case AArch64ISD::FCMP:              return "AArch64ISD::FCMP";
>    case AArch64ISD::FMIN:              return "AArch64ISD::FMIN";
>    case AArch64ISD::FMAX:              return "AArch64ISD::FMAX";
> @@ -1167,14 +1173,224 @@ static SDValue emitComparison(SDValue LH
>      LHS = LHS.getOperand(0);
>    }
> 
> -  return DAG.getNode(Opcode, dl, DAG.getVTList(VT, MVT::i32), LHS, RHS)
> +  return DAG.getNode(Opcode, dl, DAG.getVTList(VT, MVT_CC), LHS, RHS)
>        .getValue(1);
>  }
> 
> +/// \defgroup AArch64CCMP CMP;CCMP matching
> +///
> +/// These functions deal with the formation of CMP;CCMP;... sequences.
> +/// The CCMP/CCMN/FCCMP/FCCMPE instructions allow the conditional execution of
> +/// a comparison. They set the NZCV flags to a predefined value if their
> +/// predicate is false. This allows to express arbitrary conjunctions, for
> +/// example "cmp 0 (and (setCA (cmp A)) (setCB (cmp B))))"
> +/// expressed as:
> +///   cmp A
> +///   ccmp B, inv(CB), CA
> +///   check for CB flags
> +///
> +/// In general we can create code for arbitrary "... (and (and A B) C)"
> +/// sequences. We can also implement some "or" expressions, because "(or A B)"
> +/// is equivalent to "not (and (not A) (not B))" and we can implement some
> +/// negation operations:
> +/// We can negate the results of a single comparison by inverting the flags
> +/// used when the predicate fails and inverting the flags tested in the next
> +/// instruction; We can also negate the results of the whole previous
> +/// conditional compare sequence by inverting the flags tested in the next
> +/// instruction. However there is no way to negate the result of a partial
> +/// sequence.
> +///
> +/// Therefore on encountering an "or" expression we can negate the subtree on
> +/// one side and have to be able to push the negate to the leafs of the subtree
> +/// on the other side (see also the comments in code). As complete example:
> +/// "or (or (setCA (cmp A)) (setCB (cmp B)))
> +///     (and (setCC (cmp C)) (setCD (cmp D)))"
> +/// is transformed to
> +/// "not (and (not (and (setCC (cmp C)) (setCC (cmp D))))
> +///           (and (not (setCA (cmp A)) (not (setCB (cmp B))))))"
> +/// and implemented as:
> +///   cmp C
> +///   ccmp D, inv(CD), CC
> +///   ccmp A, CA, inv(CD)
> +///   ccmp B, CB, inv(CA)
> +///   check for CB flags
> +/// A counterexample is "or (and A B) (and C D)" which cannot be implemented
> +/// by conditional compare sequences.
> +/// @{
> +
> +/// Create a conditional comparison; Use CCMP, CCMN or FCCMP as apropriate.
> +static SDValue emitConditionalComparison(SDValue LHS, SDValue RHS,
> +                                         ISD::CondCode CC, SDValue CCOp,
> +                                         SDValue Condition, unsigned NZCV,
> +                                         SDLoc DL, SelectionDAG &DAG) {
> +  unsigned Opcode = 0;
> +  if (LHS.getValueType().isFloatingPoint())
> +    Opcode = AArch64ISD::FCCMP;
> +  else if (RHS.getOpcode() == ISD::SUB) {
> +    SDValue SubOp0 = RHS.getOperand(0);
> +    if (const ConstantSDNode *SubOp0C = dyn_cast<ConstantSDNode>(SubOp0))
> +      if (SubOp0C->isNullValue() && (CC == ISD::SETEQ || CC == ISD::SETNE)) {
> +        // See emitComparison() on why we can only do this for SETEQ and SETNE.
> +        Opcode = AArch64ISD::CCMN;
> +        RHS = RHS.getOperand(1);
> +      }
> +  }
> +  if (Opcode == 0)
> +    Opcode = AArch64ISD::CCMP;
> +
> +  SDValue NZCVOp = DAG.getConstant(NZCV, DL, MVT::i32);
> +  return DAG.getNode(Opcode, DL, MVT_CC, LHS, RHS, NZCVOp, Condition, CCOp);
> +}
> +
> +/// Returns true if @p Val is a tree of AND/OR/SETCC operations.
> +/// CanPushNegate is set to true if we can push a negate operation through
> +/// the tree in a was that we are left with AND operations and negate operations
> +/// at the leafs only. i.e. "not (or (or x y) z)" can be changed to
> +/// "and (and (not x) (not y)) (not z)"; "not (or (and x y) z)" cannot be
> +/// brought into such a form.
> +static bool isConjunctionDisjunctionTree(const SDValue Val, bool &CanPushNegate,
> +                                         unsigned Depth = 0) {
> +  if (!Val.hasOneUse())
> +    return false;
> +  unsigned Opcode = Val->getOpcode();
> +  if (Opcode == ISD::SETCC) {
> +    CanPushNegate = true;
> +    return true;
> +  }
> +  // Protect against stack overflow.
> +  if (Depth > 15)
> +    return false;
> +  if (Opcode == ISD::AND || Opcode == ISD::OR) {
> +    SDValue O0 = Val->getOperand(0);
> +    SDValue O1 = Val->getOperand(1);
> +    bool CanPushNegateL;
> +    if (!isConjunctionDisjunctionTree(O0, CanPushNegateL, Depth+1))
> +      return false;
> +    bool CanPushNegateR;
> +    if (!isConjunctionDisjunctionTree(O1, CanPushNegateR, Depth+1))
> +      return false;
> +    // We cannot push a negate through an AND operation (it would become an OR),
> +    // we can however change a (not (or x y)) to (and (not x) (not y)) if we can
> +    // push the negate through the x/y subtrees.
> +    CanPushNegate = (Opcode == ISD::OR) && CanPushNegateL && CanPushNegateR;
> +    return true;
> +  }
> +  return false;
> +}
> +
> +/// Emit conjunction or disjunction tree with the CMP/FCMP followed by a chain
> +/// of CCMP/CFCMP ops. See @ref AArch64CCMP.
> +/// Tries to transform the given i1 producing node @p Val to a series compare
> +/// and conditional compare operations. @returns an NZCV flags producing node
> +/// and sets @p OutCC to the flags that should be tested or returns SDValue() if
> +/// transformation was not possible.
> +/// On recursive invocations @p PushNegate may be set to true to have negation
> +/// effects pushed to the tree leafs; @p Predicate is an NZCV flag predicate
> +/// for the comparisons in the current subtree; @p Depth limits the search
> +/// depth to avoid stack overflow.
> +static SDValue emitConjunctionDisjunctionTree(SelectionDAG &DAG, SDValue Val,
> +    AArch64CC::CondCode &OutCC, bool PushNegate = false,
> +    SDValue CCOp = SDValue(), AArch64CC::CondCode Predicate = AArch64CC::AL,
> +    unsigned Depth = 0) {
> +  // We're at a tree leaf, produce a conditional comparison operation.
> +  unsigned Opcode = Val->getOpcode();
> +  if (Opcode == ISD::SETCC) {
> +    SDValue LHS = Val->getOperand(0);
> +    SDValue RHS = Val->getOperand(1);
> +    ISD::CondCode CC = cast<CondCodeSDNode>(Val->getOperand(2))-
> >get();
> +    bool isInteger = LHS.getValueType().isInteger();
> +    if (PushNegate)
> +      CC = getSetCCInverse(CC, isInteger);
> +    SDLoc DL(Val);
> +    // Determine OutCC and handle FP special case.
> +    if (isInteger) {
> +      OutCC = changeIntCCToAArch64CC(CC);
> +    } else {
> +      assert(LHS.getValueType().isFloatingPoint());
> +      AArch64CC::CondCode ExtraCC;
> +      changeFPCCToAArch64CC(CC, OutCC, ExtraCC);
> +      // Surpisingly some floating point conditions can't be tested with
a
> +      // single condition code. Construct an additional comparison in
this case.
> +      // See comment below on how we deal with OR conditions.
> +      if (ExtraCC != AArch64CC::AL) {
> +        SDValue ExtraCmp;
> +        if (!CCOp.getNode())
> +          ExtraCmp = emitComparison(LHS, RHS, CC, DL, DAG);
> +        else {
> +          SDValue ConditionOp = DAG.getConstant(Predicate, DL, MVT_CC);
> +          // Note that we want the inverse of ExtraCC, so NZCV is not
inversed.
> +          unsigned NZCV = AArch64CC::getNZCVToSatisfyCondCode(ExtraCC);
> +          ExtraCmp = emitConditionalComparison(LHS, RHS, CC, CCOp,
> ConditionOp,
> +                                               NZCV, DL, DAG);
> +        }
> +        CCOp = ExtraCmp;
> +        Predicate = AArch64CC::getInvertedCondCode(ExtraCC);
> +        OutCC = AArch64CC::getInvertedCondCode(OutCC);
> +      }
> +    }
> +
> +    // Produce a normal comparison if we are first in the chain
> +    if (!CCOp.getNode())
> +      return emitComparison(LHS, RHS, CC, DL, DAG);
> +    // Otherwise produce a ccmp.
> +    SDValue ConditionOp = DAG.getConstant(Predicate, DL, MVT_CC);
> +    AArch64CC::CondCode InvOutCC = AArch64CC::getInvertedCondCode(OutCC);
> +    unsigned NZCV = AArch64CC::getNZCVToSatisfyCondCode(InvOutCC);
> +    return emitConditionalComparison(LHS, RHS, CC, CCOp, ConditionOp, NZCV, DL,
> +                                     DAG);
> +  } else if (Opcode != ISD::AND && Opcode != ISD::OR)
> +    return SDValue();
> +
> +  assert((Opcode == ISD::OR || !PushNegate)
> +         && "Can only push negate through OR operation");
> +
> +  // Check if both sides can be transformed.
> +  SDValue LHS = Val->getOperand(0);
> +  SDValue RHS = Val->getOperand(1);
> +  bool CanPushNegateL;
> +  if (!isConjunctionDisjunctionTree(LHS, CanPushNegateL, Depth+1))
> +    return SDValue();
> +  bool CanPushNegateR;
> +  if (!isConjunctionDisjunctionTree(RHS, CanPushNegateR, Depth+1))
> +    return SDValue();
> +
> +  // Do we need to negate our operands?
> +  bool NegateOperands = Opcode == ISD::OR;
> +  // We can negate the results of all previous operations by inverting the
> +  // predicate flags giving us a free negation for one side. For the other side
> +  // we need to be able to push the negation to the leafs of the tree.
> +  if (NegateOperands) {
> +    if (!CanPushNegateL && !CanPushNegateR)
> +      return SDValue();
> +    // Order the side where we can push the negate through to LHS.
> +    if (!CanPushNegateL && CanPushNegateR) {
> +      std::swap(LHS, RHS);
> +      CanPushNegateL = true;
> +    }
> +  }
> +
> +  // Emit RHS. If we want to negate the tree we only need to push a negate
> +  // through if we are already in a PushNegate case, otherwise we can negate
> +  // the "flags to test" afterwards.
> +  AArch64CC::CondCode RHSCC;
> +  SDValue CmpR = emitConjunctionDisjunctionTree(DAG, RHS, RHSCC, PushNegate,
> +                                                CCOp, Predicate, Depth+1);
> +  if (NegateOperands && !PushNegate)
> +    RHSCC = AArch64CC::getInvertedCondCode(RHSCC);
> +  // Emit LHS. We must push the negate through if we need to negate it.
> +  SDValue CmpL = emitConjunctionDisjunctionTree(DAG, LHS, OutCC, NegateOperands,
> +                                                CmpR, RHSCC, Depth+1);
> +  // If we transformed an OR to and AND then we have to negate the result
> +  // (or absorb a PushNegate resulting in a double negation).
> +  if (Opcode == ISD::OR && !PushNegate)
> +    OutCC = AArch64CC::getInvertedCondCode(OutCC);
> +  return CmpL;
> +}
> +
> +/// @}
> +
>  static SDValue getAArch64Cmp(SDValue LHS, SDValue RHS, ISD::CondCode
> CC,
>                               SDValue &AArch64cc, SelectionDAG &DAG, SDLoc
dl) {
> -  SDValue Cmp;
> -  AArch64CC::CondCode AArch64CC;
>    if (ConstantSDNode *RHSC = dyn_cast<ConstantSDNode>(RHS.getNode()))
> {
>      EVT VT = RHS.getValueType();
>      uint64_t C = RHSC->getZExtValue();
> @@ -1229,47 +1445,56 @@ static SDValue getAArch64Cmp(SDValue LHS
>        }
>      }
>    }
> -  // The imm operand of ADDS is an unsigned immediate, in the range 0 to
> 4095.
> -  // For the i8 operand, the largest immediate is 255, so this can be
easily
> -  // encoded in the compare instruction. For the i16 operand, however,
the
> -  // largest immediate cannot be encoded in the compare.
> -  // Therefore, use a sign extending load and cmn to avoid materializing
the -
> 1
> -  // constant. For example,
> -  // movz w1, #65535
> -  // ldrh w0, [x0, #0]
> -  // cmp w0, w1
> -  // >
> -  // ldrsh w0, [x0, #0]
> -  // cmn w0, #1
> -  // Fundamental, we're relying on the property that (zext LHS) == (zext
RHS)
> -  // if and only if (sext LHS) == (sext RHS). The checks are in place to
ensure
> -  // both the LHS and RHS are truely zero extended and to make sure the
> -  // transformation is profitable.
> +  SDValue Cmp;
> +  AArch64CC::CondCode AArch64CC;
>    if ((CC == ISD::SETEQ || CC == ISD::SETNE) && isa<ConstantSDNode>(RHS))
> {
> -    if ((cast<ConstantSDNode>(RHS)->getZExtValue() >> 16 == 0) &&
> -        isa<LoadSDNode>(LHS)) {
> -      if (cast<LoadSDNode>(LHS)->getExtensionType() == ISD::ZEXTLOAD &&
> -          cast<LoadSDNode>(LHS)->getMemoryVT() == MVT::i16 &&
> -          LHS.getNode()->hasNUsesOfValue(1, 0)) {
> -        int16_t ValueofRHS = cast<ConstantSDNode>(RHS)->getZExtValue();
> -        if (ValueofRHS < 0 && isLegalArithImmed(-ValueofRHS)) {
> -          SDValue SExt =
> -              DAG.getNode(ISD::SIGN_EXTEND_INREG, dl, LHS.getValueType(),
> LHS,
> -                          DAG.getValueType(MVT::i16));
> -          Cmp = emitComparison(SExt,
> -                               DAG.getConstant(ValueofRHS, dl,
> -                                               RHS.getValueType()),
> -                               CC, dl, DAG);
> -          AArch64CC = changeIntCCToAArch64CC(CC);
> -          AArch64cc = DAG.getConstant(AArch64CC, dl, MVT::i32);
> -          return Cmp;
> -        }
> +    const ConstantSDNode *RHSC = cast<ConstantSDNode>(RHS);
> +
> +    // The imm operand of ADDS is an unsigned immediate, in the range 0
to
> 4095.
> +    // For the i8 operand, the largest immediate is 255, so this can be
easily
> +    // encoded in the compare instruction. For the i16 operand, however,
the
> +    // largest immediate cannot be encoded in the compare.
> +    // Therefore, use a sign extending load and cmn to avoid
materializing the
> +    // -1 constant. For example,
> +    // movz w1, #65535
> +    // ldrh w0, [x0, #0]
> +    // cmp w0, w1
> +    // >
> +    // ldrsh w0, [x0, #0]
> +    // cmn w0, #1
> +    // Fundamental, we're relying on the property that (zext LHS) ==
(zext
> RHS)
> +    // if and only if (sext LHS) == (sext RHS). The checks are in place
to
> +    // ensure both the LHS and RHS are truely zero extended and to make
> sure the
> +    // transformation is profitable.
> +    if ((RHSC->getZExtValue() >> 16 == 0) && isa<LoadSDNode>(LHS) &&
> +        cast<LoadSDNode>(LHS)->getExtensionType() == ISD::ZEXTLOAD &&
> +        cast<LoadSDNode>(LHS)->getMemoryVT() == MVT::i16 &&
> +        LHS.getNode()->hasNUsesOfValue(1, 0)) {
> +      int16_t ValueofRHS = cast<ConstantSDNode>(RHS)->getZExtValue();
> +      if (ValueofRHS < 0 && isLegalArithImmed(-ValueofRHS)) {
> +        SDValue SExt =
> +            DAG.getNode(ISD::SIGN_EXTEND_INREG, dl, LHS.getValueType(),
> LHS,
> +                        DAG.getValueType(MVT::i16));
> +        Cmp = emitComparison(SExt, DAG.getConstant(ValueofRHS, dl,
> +                                                   RHS.getValueType()),
> +                             CC, dl, DAG);
> +        AArch64CC = changeIntCCToAArch64CC(CC);
> +      }
> +    }
> +
> +    if (!Cmp && (RHSC->isNullValue() || RHSC->isOne())) {
> +      if ((Cmp = emitConjunctionDisjunctionTree(DAG, LHS, AArch64CC))) {
> +        if ((CC == ISD::SETNE) ^ RHSC->isNullValue())
> +          AArch64CC = AArch64CC::getInvertedCondCode(AArch64CC);
>        }
>      }
>    }
> -  Cmp = emitComparison(LHS, RHS, CC, dl, DAG);
> -  AArch64CC = changeIntCCToAArch64CC(CC);
> -  AArch64cc = DAG.getConstant(AArch64CC, dl, MVT::i32);
> +
> +  if (!Cmp) {
> +    Cmp = emitComparison(LHS, RHS, CC, dl, DAG);
> +    AArch64CC = changeIntCCToAArch64CC(CC);
> +  }
> +  AArch64cc = DAG.getConstant(AArch64CC, dl, MVT_CC);
>    return Cmp;
>  }
> 
> @@ -9294,3 +9519,8 @@ bool AArch64TargetLowering::functionArgu
>      Type *Ty, CallingConv::ID CallConv, bool isVarArg) const {
>    return Ty->isArrayTy();
>  }
> +
> +bool AArch64TargetLowering::shouldNormalizeToSelectSequence(LLVMContext &,
> +                                                            EVT) const {
> +  return false;
> +}
> 
> Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h
> URL: http://llvm.org/viewvc/llvm-
> project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h?rev=242436&
> r1=242435&r2=242436&view=diff
> ==========================================================
> ====================
> --- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h (original)
> +++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h Thu Jul 16
> +++ 15:02:37 2015
> @@ -58,6 +58,11 @@ enum NodeType : unsigned {
>    SBCS,
>    ANDS,
> 
> +  // Conditional compares. Operands: left,right,falsecc,cc,flags
> +  CCMP,
> +  CCMN,
> +  FCCMP,
> +
>    // Floating point comparison
>    FCMP,
> 
> @@ -516,6 +521,8 @@ private:
>    bool functionArgumentNeedsConsecutiveRegisters(Type *Ty,
>                                                   CallingConv::ID
CallConv,
>                                                   bool isVarArg) const
override;
> +
> +  bool shouldNormalizeToSelectSequence(LLVMContext &, EVT) const override;
>  };
> 
>  namespace AArch64 {
> 
> Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td
> URL: http://llvm.org/viewvc/llvm-
> project/llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td?rev=242436
> &r1=242435&r2=242436&view=diff
> ==========================================================
> ====================
> --- llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td (original)
> +++ llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td Thu Jul 16
> +++ 15:02:37 2015
> @@ -525,6 +525,13 @@ def imm0_31 : Operand<i64>, ImmLeaf<i64,
>    let ParserMatchClass = Imm0_31Operand;  }
> 
> +// True if the 32-bit immediate is in the range [0,31]
> +def imm32_0_31 : Operand<i32>, ImmLeaf<i32, [{
> +  return ((uint64_t)Imm) < 32;
> +}]> {
> +  let ParserMatchClass = Imm0_31Operand;
> +}
> +
>  // imm0_15 predicate - True if the immediate is in the range [0,15]  def
> imm0_15 : Operand<i64>, ImmLeaf<i64, [{
>    return ((uint64_t)Imm) < 16;
> @@ -542,7 +549,9 @@ def imm0_7 : Operand<i64>, ImmLeaf<i64,  //
> imm32_0_15 predicate - True if the 32-bit immediate is in the range [0,15]
> def imm32_0_15 : Operand<i32>, ImmLeaf<i32, [{
>    return ((uint32_t)Imm) < 16;
> -}]>;
> +}]> {
> +  let ParserMatchClass = Imm0_15Operand; }
> 
>  // An arithmetic shifter operand:
>  //  {7-6} - shift type: 00 = lsl, 01 = lsr, 10 = asr @@ -2108,9 +2117,12
@@
> multiclass LogicalRegS<bits<2> opc, bit
>  //---
> 
>  let mayLoad = 0, mayStore = 0, hasSideEffects = 0 in -class
> BaseCondSetFlagsImm<bit op, RegisterClass regtype, string asm>
> -    : I<(outs), (ins regtype:$Rn, imm0_31:$imm, imm0_15:$nzcv,
> ccode:$cond),
> -         asm, "\t$Rn, $imm, $nzcv, $cond", "", []>,
> +class BaseCondComparisonImm<bit op, RegisterClass regtype, ImmLeaf
> immtype,
> +                            string mnemonic, SDNode OpNode>
> +    : I<(outs), (ins regtype:$Rn, immtype:$imm, imm32_0_15:$nzcv,
> ccode:$cond),
> +         mnemonic, "\t$Rn, $imm, $nzcv, $cond", "",
> +         [(set NZCV, (OpNode regtype:$Rn, immtype:$imm, (i32 imm:$nzcv),
> +                             (i32 imm:$cond), NZCV))]>,
>        Sched<[WriteI, ReadI]> {
>    let Uses = [NZCV];
>    let Defs = [NZCV];
> @@ -2130,19 +2142,13 @@ class BaseCondSetFlagsImm<bit op, Regist
>    let Inst{3-0}   = nzcv;
>  }
> 
> -multiclass CondSetFlagsImm<bit op, string asm> {
> -  def Wi : BaseCondSetFlagsImm<op, GPR32, asm> {
> -    let Inst{31} = 0;
> -  }
> -  def Xi : BaseCondSetFlagsImm<op, GPR64, asm> {
> -    let Inst{31} = 1;
> -  }
> -}
> -
>  let mayLoad = 0, mayStore = 0, hasSideEffects = 0 in -class
> BaseCondSetFlagsReg<bit op, RegisterClass regtype, string asm>
> -    : I<(outs), (ins regtype:$Rn, regtype:$Rm, imm0_15:$nzcv,
ccode:$cond),
> -         asm, "\t$Rn, $Rm, $nzcv, $cond", "", []>,
> +class BaseCondComparisonReg<bit op, RegisterClass regtype, string
> mnemonic,
> +                            SDNode OpNode>
> +    : I<(outs), (ins regtype:$Rn, regtype:$Rm, imm32_0_15:$nzcv,
> ccode:$cond),
> +         mnemonic, "\t$Rn, $Rm, $nzcv, $cond", "",
> +         [(set NZCV, (OpNode regtype:$Rn, regtype:$Rm, (i32 imm:$nzcv),
> +                             (i32 imm:$cond), NZCV))]>,
>        Sched<[WriteI, ReadI, ReadI]> {
>    let Uses = [NZCV];
>    let Defs = [NZCV];
> @@ -2162,11 +2168,19 @@ class BaseCondSetFlagsReg<bit op, Regist
>    let Inst{3-0}   = nzcv;
>  }
> 
> -multiclass CondSetFlagsReg<bit op, string asm> {
> -  def Wr : BaseCondSetFlagsReg<op, GPR32, asm> {
> +multiclass CondComparison<bit op, string mnemonic, SDNode OpNode> {
> +  // immediate operand variants
> +  def Wi : BaseCondComparisonImm<op, GPR32, imm32_0_31, mnemonic,
> +OpNode> {
>      let Inst{31} = 0;
>    }
> -  def Xr : BaseCondSetFlagsReg<op, GPR64, asm> {
> +  def Xi : BaseCondComparisonImm<op, GPR64, imm0_31, mnemonic,
> OpNode> {
> +    let Inst{31} = 1;
> +  }
> +  // register operand variants
> +  def Wr : BaseCondComparisonReg<op, GPR32, mnemonic, OpNode> {
> +    let Inst{31} = 0;
> +  }
> +  def Xr : BaseCondComparisonReg<op, GPR64, mnemonic, OpNode> {
>      let Inst{31} = 1;
>    }
>  }
> @@ -3974,11 +3988,14 @@ multiclass FPComparison<bit signalAllNan
>  //---
> 
>  let mayLoad = 0, mayStore = 0, hasSideEffects = 0 in -class
> BaseFPCondComparison<bit signalAllNans,
> -                              RegisterClass regtype, string asm>
> -    : I<(outs), (ins regtype:$Rn, regtype:$Rm, imm0_15:$nzcv,
ccode:$cond),
> -         asm, "\t$Rn, $Rm, $nzcv, $cond", "", []>,
> +class BaseFPCondComparison<bit signalAllNans, RegisterClass regtype,
> +                           string mnemonic, list<dag> pat>
> +    : I<(outs), (ins regtype:$Rn, regtype:$Rm, imm32_0_15:$nzcv,
> ccode:$cond),
> +         mnemonic, "\t$Rn, $Rm, $nzcv, $cond", "", pat>,
>        Sched<[WriteFCmp]> {
> +  let Uses = [NZCV];
> +  let Defs = [NZCV];
> +
>    bits<5> Rn;
>    bits<5> Rm;
>    bits<4> nzcv;
> @@ -3994,16 +4011,18 @@ class BaseFPCondComparison<bit signalAll
>    let Inst{3-0}   = nzcv;
>  }
> 
> -multiclass FPCondComparison<bit signalAllNans, string asm> {
> -  let Defs = [NZCV], Uses = [NZCV] in {
> -  def Srr : BaseFPCondComparison<signalAllNans, FPR32, asm> {
> +multiclass FPCondComparison<bit signalAllNans, string mnemonic,
> +                            SDPatternOperator OpNode = null_frag> {
> +  def Srr : BaseFPCondComparison<signalAllNans, FPR32, mnemonic,
> +      [(set NZCV, (OpNode (f32 FPR32:$Rn), (f32 FPR32:$Rm), (i32
imm:$nzcv),
> +                          (i32 imm:$cond), NZCV))]> {
>      let Inst{22} = 0;
>    }
> -
> -  def Drr : BaseFPCondComparison<signalAllNans, FPR64, asm> {
> +  def Drr : BaseFPCondComparison<signalAllNans, FPR64, mnemonic,
> +      [(set NZCV, (OpNode (f64 FPR64:$Rn), (f64 FPR64:$Rm), (i32
imm:$nzcv),
> +                          (i32 imm:$cond), NZCV))]> {
>      let Inst{22} = 1;
>    }
> -  } // Defs = [NZCV], Uses = [NZCV]
>  }
> 
>  //---
> 
> Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td
> URL: http://llvm.org/viewvc/llvm-
> project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td?rev=242436&r1=
> 242435&r2=242436&view=diff
> ==========================================================
> ====================
> --- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td (original)
> +++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td Thu Jul 16
> +++ 15:02:37 2015
> @@ -66,6 +66,20 @@ def SDT_AArch64CSel  : SDTypeProfile<1,
>                                      SDTCisSameAs<0, 2>,
>                                      SDTCisInt<3>,
>                                      SDTCisVT<4, i32>]>;
> +def SDT_AArch64CCMP : SDTypeProfile<1, 5,
> +                                    [SDTCisVT<0, i32>,
> +                                     SDTCisInt<1>,
> +                                     SDTCisSameAs<1, 2>,
> +                                     SDTCisInt<3>,
> +                                     SDTCisInt<4>,
> +                                     SDTCisVT<5, i32>]>; def
> +SDT_AArch64FCCMP : SDTypeProfile<1, 5,
> +                                     [SDTCisVT<0, i32>,
> +                                      SDTCisFP<1>,
> +                                      SDTCisSameAs<1, 2>,
> +                                      SDTCisInt<3>,
> +                                      SDTCisInt<4>,
> +                                      SDTCisVT<5, i32>]>;
>  def SDT_AArch64FCmp   : SDTypeProfile<0, 2,
>                                     [SDTCisFP<0>,
>                                      SDTCisSameAs<0, 1>]>; @@ -160,6
+174,10 @@ def
> AArch64and_flag  : SDNode<"AArch64IS  def AArch64adc_flag  :
> SDNode<"AArch64ISD::ADCS",  SDTBinaryArithWithFlagsInOut>;  def
> AArch64sbc_flag  : SDNode<"AArch64ISD::SBCS",
> SDTBinaryArithWithFlagsInOut>;
> 
> +def AArch64ccmp      : SDNode<"AArch64ISD::CCMP",  SDT_AArch64CCMP>;
> +def AArch64ccmn      : SDNode<"AArch64ISD::CCMN",  SDT_AArch64CCMP>;
> +def AArch64fccmp     : SDNode<"AArch64ISD::FCCMP",
> SDT_AArch64FCCMP>;
> +
>  def AArch64threadpointer : SDNode<"AArch64ISD::THREAD_POINTER",
> SDTPtrLeaf>;
> 
>  def AArch64fcmp      : SDNode<"AArch64ISD::FCMP", SDT_AArch64FCmp>;
> @@ -1020,13 +1038,10 @@ def : InstAlias<"uxth $dst, $src", (UBFM  def :
> InstAlias<"uxtw $dst, $src", (UBFMXri GPR64:$dst, GPR64:$src, 0, 31)>;
> 
>
//===----------------------------------------------------------------------=
==//
> -// Conditionally set flags instructions.
> +// Conditional comparison instructions.
>
//===----------------------------------------------------------------------=
==//
> -defm CCMN : CondSetFlagsImm<0, "ccmn">;
> -defm CCMP : CondSetFlagsImm<1, "ccmp">;
> -
> -defm CCMN : CondSetFlagsReg<0, "ccmn">;
> -defm CCMP : CondSetFlagsReg<1, "ccmp">;
> +defm CCMN : CondComparison<0, "ccmn", AArch64ccmn>;
> +defm CCMP : CondComparison<1, "ccmp", AArch64ccmp>;
> 
>
//===----------------------------------------------------------------------=
==//
>  // Conditional select instructions.
> @@ -2556,7 +2571,7 @@ defm FCMP  : FPComparison<0, "fcmp", AAr  //===-
> ---------------------------------------------------------------------===//
> 
>  defm FCCMPE : FPCondComparison<1, "fccmpe">; -defm FCCMP  :
> FPCondComparison<0, "fccmp">;
> +defm FCCMP  : FPCondComparison<0, "fccmp", AArch64fccmp>;
> 
>
//===----------------------------------------------------------------------=
==//
>  // Floating point conditional select instruction.
> 
> Modified: llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll
> URL: http://llvm.org/viewvc/llvm-
> project/llvm/trunk/test/CodeGen/AArch64/arm64-
> ccmp.ll?rev=242436&r1=242435&r2=242436&view=diff
> ==========================================================
> ====================
> --- llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll (original)
> +++ llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll Thu Jul 16 15:02:37
> +++ 2015
> @@ -287,3 +287,99 @@ sw.bb.i.i:
>    %code1.i.i.phi.trans.insert = getelementptr inbounds %str1, %str1* %0,
i64
> 0, i32 0, i32 0, i64 16
>    br label %sw.bb.i.i
>  }
> +
> +; CHECK-LABEL: select_and
> +define i64 @select_and(i32 %w0, i32 %w1, i64 %x2, i64 %x3) {
> +; CHECK: cmp w1, #5
> +; CHECK-NEXT: ccmp w0, w1, #0, ne
> +; CHECK-NEXT: csel x0, x2, x3, lt
> +; CHECK-NEXT: ret
> +  %1 = icmp slt i32 %w0, %w1
> +  %2 = icmp ne i32 5, %w1
> +  %3 = and i1 %1, %2
> +  %sel = select i1 %3, i64 %x2, i64 %x3
> +  ret i64 %sel
> +}
> +
> +; CHECK-LABEL: select_or
> +define i64 @select_or(i32 %w0, i32 %w1, i64 %x2, i64 %x3) {
> +; CHECK: cmp w1, #5
> +; CHECK-NEXT: ccmp w0, w1, #8, eq
> +; CHECK-NEXT: csel x0, x2, x3, lt
> +; CHECK-NEXT: ret
> +  %1 = icmp slt i32 %w0, %w1
> +  %2 = icmp ne i32 5, %w1
> +  %3 = or i1 %1, %2
> +  %sel = select i1 %3, i64 %x2, i64 %x3
> +  ret i64 %sel
> +}
> +
> +; CHECK-LABEL: select_complicated
> +define i16 @select_complicated(double %v1, double %v2, i16 %a, i16 %b) {
> +; CHECK: ldr [[REG:d[0-9]+]],
> +; CHECK: fcmp d0, d2
> +; CHECK-NEXT: fmov d2, #13.00000000
> +; CHECK-NEXT: fccmp d1, d2, #4, ne
> +; CHECK-NEXT: fccmp d0, d1, #1, ne
> +; CHECK-NEXT: fccmp d0, d1, #4, vc
> +; CEHCK-NEXT: csel w0, w0, w1, eq
> +  %1 = fcmp one double %v1, %v2
> +  %2 = fcmp oeq double %v2, 13.0
> +  %3 = fcmp oeq double %v1, 42.0
> +  %or0 = or i1 %2, %3
> +  %or1 = or i1 %1, %or0
> +  %sel = select i1 %or1, i16 %a, i16 %b
> +  ret i16 %sel
> +}
> +
> +; CHECK-LABEL: gccbug
> +define i64 @gccbug(i64 %x0, i64 %x1) {
> +; CHECK: cmp x1, #0
> +; CHECK-NEXT: ccmp x0, #2, #0, eq
> +; CHECK-NEXT: ccmp x0, #4, #4, ne
> +; CHECK-NEXT: orr w[[REGNUM:[0-9]+]], wzr, #0x1
> +; CHECK-NEXT: cinc x0, x[[REGNUM]], eq
> +; CHECK-NEXT: ret
> +  %cmp0 = icmp eq i64 %x1, 0
> +  %cmp1 = icmp eq i64 %x0, 2
> +  %cmp2 = icmp eq i64 %x0, 4
> +
> +  %or = or i1 %cmp2, %cmp1
> +  %and = and i1 %or, %cmp0
> +
> +  %sel = select i1 %and, i64 2, i64 1
> +  ret i64 %sel
> +}
> +
> +; CHECK-LABEL: select_ororand
> +define i32 @select_ororand(i32 %w0, i32 %w1, i32 %w2, i32 %w3) {
> +; CHECK: cmp w3, #4
> +; CHECK-NEXT: ccmp w2, #2, #0, gt
> +; CHECK-NEXT: ccmp w1, #13, #2, ge
> +; CHECK-NEXT: ccmp w0, #0, #4, ls
> +; CHECK-NEXT: csel w0, w3, wzr, eq
> +; CHECK-NEXT: ret
> +  %c0 = icmp eq i32 %w0, 0
> +  %c1 = icmp ugt i32 %w1, 13
> +  %c2 = icmp slt i32 %w2, 2
> +  %c4 = icmp sgt i32 %w3, 4
> +  %or = or i1 %c0, %c1
> +  %and = and i1 %c2, %c4
> +  %or1 = or i1 %or, %and
> +  %sel = select i1 %or1, i32 %w3, i32 0
> +  ret i32 %sel
> +}
> +
> +; CHECK-LABEL: select_noccmp
> +define i64 @select_noccmp(i64 %v1, i64 %v2, i64 %v3, i64 %r) {
> +; CHECK-NOT: CCMP
> +  %c0 = icmp slt i64 %v1, 0
> +  %c1 = icmp sgt i64 %v1, 13
> +  %c2 = icmp slt i64 %v3, 2
> +  %c4 = icmp sgt i64 %v3, 4
> +  %and0 = and i1 %c0, %c1
> +  %and1 = and i1 %c2, %c4
> +  %or = or i1 %and0, %and1
> +  %sel = select i1 %or, i64 0, i64 %r
> +  ret i64 %sel
> +}
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
A non-text attachment was scrubbed...
Name: reg.ll
Type: application/octet-stream
Size: 1805 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150717/92520417/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: reg.r242236.s
Type: application/octet-stream
Size: 1111 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150717/92520417/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: reg.s
Type: application/octet-stream
Size: 1053 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150717/92520417/attachment-0002.obj>

