[llvm] r242436 - AArch64: Implement conditional compare sequence matching.
Matthias Braun
matze at braunis.de
Fri Jul 17 18:44:49 PDT 2015
I just investigated it and this is a tricky one...
Before the ccmp change, we normalized like this in the SelectionDAG (LLVM IR normalizes in the other direction):
select(C0|C1, A, B) => select(C0, A, select(C1, A, B)).
In this particular case you get extra lucky with this normalization, because it turns out that "select(C1, A, B)" already exists and can be CSE'd.
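To make the two shapes concrete, here is a rough C++ sketch (not LLVM code; `selectOr`, `selectNested` and `checkSelectForms` are names made up for this illustration, and plain ints stand in for DAG values). It only shows that both forms compute the same result and where the reusable inner select sits:

```cpp
#include <cassert>

// Hypothetical stand-ins for the two DAG shapes; not LLVM API.

// select(C0|C1, A, B)
int selectOr(bool c0, bool c1, int a, int b) {
  return (c0 || c1) ? a : b;
}

// select(C0, A, select(C1, A, B))
int selectNested(bool c0, bool c1, int a, int b) {
  // If this inner select is already computed elsewhere, it can be shared (CSE).
  int inner = c1 ? a : b;
  return c0 ? a : inner;
}

// Compare both shapes over every combination of the two conditions.
void checkSelectForms(int a, int b) {
  for (int mask = 0; mask < 4; ++mask) {
    bool c0 = (mask & 1) != 0;
    bool c1 = (mask & 2) != 0;
    assert(selectOr(c0, c1, a, b) == selectNested(c0, c1, a, b));
  }
}
```

The nested form costs an extra select in isolation, so it only pays off when the inner value is reused, as it apparently is in the reproducer.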
With CMP;CCMP available, we generally want to normalize in the other direction, and unfortunately we no longer see the CSE opportunity.
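For reference, a simplified model of how a CMP;CCMP pair evaluates a combined condition in the flags. This is a sketch under assumptions: the `Flags`, `Cond`, `cmp`, `ccmp` and `holds` helpers are invented for illustration, only four condition codes are modeled, and the two sequences are the ones expected by the select_or and select_and tests in the quoted commit:

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the AArch64 NZCV flags; illustrative only.
struct Flags { bool n, z, c, v; };

enum class Cond { EQ, NE, LT, GE };

// Flags set by "cmp a, b", i.e. by the 32-bit subtraction a - b.
Flags cmp(int32_t a, int32_t b) {
  uint32_t ua = static_cast<uint32_t>(a);
  uint32_t ub = static_cast<uint32_t>(b);
  uint32_t r = ua - ub;
  Flags f;
  f.n = (r >> 31) != 0;                      // result is negative
  f.z = r == 0;                              // result is zero
  f.c = ua >= ub;                            // no borrow
  f.v = (((ua ^ ub) & (ua ^ r)) >> 31) != 0; // signed overflow
  return f;
}

// Does condition code `cond` pass for the given flags?
bool holds(Cond cond, Flags f) {
  switch (cond) {
  case Cond::EQ: return f.z;
  case Cond::NE: return !f.z;
  case Cond::LT: return f.n != f.v;
  case Cond::GE: return f.n == f.v;
  }
  return false;
}

// "ccmp a, b, #nzcv, cond": compare if cond passes, else load the immediate.
Flags ccmp(int32_t a, int32_t b, unsigned nzcv, Cond cond, Flags in) {
  assert(nzcv < 16 && "NZCV immediate is four bits");
  if (holds(cond, in))
    return cmp(a, b);
  return Flags{(nzcv & 8) != 0, (nzcv & 4) != 0, (nzcv & 2) != 0, (nzcv & 1) != 0};
}

// cmp w1, #5 ; ccmp w0, w1, #8, eq ; csel ..., lt
bool selectOrCondition(int32_t w0, int32_t w1) {
  Flags f = cmp(w1, 5);
  f = ccmp(w0, w1, 8, Cond::EQ, f);
  return holds(Cond::LT, f);
}

// cmp w1, #5 ; ccmp w0, w1, #0, ne ; csel ..., lt
bool selectAndCondition(int32_t w0, int32_t w1) {
  Flags f = cmp(w1, 5);
  f = ccmp(w0, w1, 0, Cond::NE, f);
  return holds(Cond::LT, f);
}
```

In this model selectOrCondition(w0, w1) agrees with (w0 < w1) || (w1 != 5) and selectAndCondition(w0, w1) with (w0 < w1) && (w1 != 5): the immediate (#8 or #0) is chosen so that the final "lt" test passes or fails when the second compare is skipped.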
I think it should be possible to match this specific situation in LLVM's InstCombine already. I hope the benchmark is important enough to warrant a bunch of strange extra code in InstCombine...
- Matthias
> On Jul 17, 2015, at 8:39 AM, Arnaud A. de Grandmaison <arnaud.degrandmaison at arm.com> wrote:
>
> Hi Matthias,
>
> Thanks for working on this! It brings some nice improvements. However, we
> found an 11% regression in one of our performance benchmarks on Cortex-A57.
> I've attached a reproducer.
>
>> llc -o - -mcpu=cortex-a57 reg.ll // Without your patch
> ...
> csel w8, w9, w8, lt
> adrp x9, d
> cmp w8, #100
> str w8, [x9, :lo12:d]
> csel w8, w11, w8, lt
> cmp w8, w11, lsl #2
> csel w8, w11, w8, gt
> str w8, [x9, :lo12:d]
> adrp x9, acc
> ldr x11, [x9, :lo12:acc]
> ...
>
>> llc -o - -mcpu=cortex-a57 reg.ll // With your patch
> ...
> csel w8, w9, w8, lt
> cmp w8, #100
> csel w12, w11, w8, lt
> lsl w13, w11, #2
> adrp x9, d
> str w8, [x9, :lo12:d]
> ccmp w12, w13, #0, ge
> csel w8, w11, w8, gt
> cmp w12, w13
> str w8, [x9, :lo12:d]
> adrp x9, acc
> csel w8, w11, w12, gt
> ldr x11, [x9, :lo12:acc]
> ...
>
> Before your patch, we had 3 x csel + 2 x cmp
> With your patch applied, we have 4 x csel + 2 x cmp + 1 x ccmp + 1 x lsl
>
> Could you have a look at this regression?
>
> Thanks,
> Arnaud
>
>> -----Original Message-----
>> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Matthias Braun
>> Sent: 16 July 2015 22:03
>> To: llvm-commits at cs.uiuc.edu
>> Subject: [llvm] r242436 - AArch64: Implement conditional compare sequence
>> matching.
>>
>> Author: matze
>> Date: Thu Jul 16 15:02:37 2015
>> New Revision: 242436
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=242436&view=rev
>> Log:
>> AArch64: Implement conditional compare sequence matching.
>>
>> This is a new iteration of the reverted r238793 /
>> http://reviews.llvm.org/D8232 which wrongly assumed that any and/or trees
>> can be represented by conditional compare sequences, however there are
>> some restrictions to that. This version fixes this and adds comments that
>> explain exactly what types of and/or trees can actually be implemented as
>> conditional compare sequences.
>>
>> Related to http://llvm.org/PR20927, rdar://18326194
>>
>> Differential Revision: http://reviews.llvm.org/D10579
>>
>> Modified:
>> llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
>> llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h
>> llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td
>> llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td
>> llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll
>>
>> Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=242436&r1=242435&r2=242436&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original)
>> +++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Thu Jul 16 15:02:37 2015
>> @@ -76,6 +76,9 @@ cl::opt<bool> EnableAArch64ELFLocalDynam
>> cl::desc("Allow AArch64 Local Dynamic TLS code generation"),
>> cl::init(false));
>>
>> +/// Value type used for condition codes.
>> +static const MVT MVT_CC = MVT::i32;
>> +
>> AArch64TargetLowering::AArch64TargetLowering(const TargetMachine
>> &TM,
>> const AArch64Subtarget &STI)
>> : TargetLowering(TM), Subtarget(&STI) {
>> @@ -809,6 +812,9 @@ const char *AArch64TargetLowering::getTa
>> case AArch64ISD::ADCS: return "AArch64ISD::ADCS";
>> case AArch64ISD::SBCS: return "AArch64ISD::SBCS";
>> case AArch64ISD::ANDS: return "AArch64ISD::ANDS";
>> + case AArch64ISD::CCMP: return "AArch64ISD::CCMP";
>> + case AArch64ISD::CCMN: return "AArch64ISD::CCMN";
>> + case AArch64ISD::FCCMP: return "AArch64ISD::FCCMP";
>> case AArch64ISD::FCMP: return "AArch64ISD::FCMP";
>> case AArch64ISD::FMIN: return "AArch64ISD::FMIN";
>> case AArch64ISD::FMAX: return "AArch64ISD::FMAX";
>> @@ -1167,14 +1173,224 @@ static SDValue emitComparison(SDValue LH
>> LHS = LHS.getOperand(0);
>> }
>>
>> - return DAG.getNode(Opcode, dl, DAG.getVTList(VT, MVT::i32), LHS, RHS)
>> + return DAG.getNode(Opcode, dl, DAG.getVTList(VT, MVT_CC), LHS, RHS)
>> .getValue(1);
>> }
>>
>> +/// \defgroup AArch64CCMP CMP;CCMP matching
>> +///
>> +/// These functions deal with the formation of CMP;CCMP;... sequences.
>> +/// The CCMP/CCMN/FCCMP/FCCMPE instructions allow the conditional execution of
>> +/// a comparison. They set the NZCV flags to a predefined value if their
>> +/// predicate is false. This allows to express arbitrary conjunctions, for
>> +/// example "cmp 0 (and (setCA (cmp A)) (setCB (cmp B))))"
>> +/// expressed as:
>> +/// cmp A
>> +/// ccmp B, inv(CB), CA
>> +/// check for CB flags
>> +///
>> +/// In general we can create code for arbitrary "... (and (and A B) C)"
>> +/// sequences. We can also implement some "or" expressions, because "(or A B)"
>> +/// is equivalent to "not (and (not A) (not B))" and we can implement some
>> +/// negation operations:
>> +/// We can negate the results of a single comparison by inverting the flags
>> +/// used when the predicate fails and inverting the flags tested in the next
>> +/// instruction; We can also negate the results of the whole previous
>> +/// conditional compare sequence by inverting the flags tested in the next
>> +/// instruction. However there is no way to negate the result of a partial
>> +/// sequence.
>> +///
>> +/// Therefore on encountering an "or" expression we can negate the subtree on
>> +/// one side and have to be able to push the negate to the leafs of the subtree
>> +/// on the other side (see also the comments in code). As complete example:
>> +/// "or (or (setCA (cmp A)) (setCB (cmp B)))
>> +/// (and (setCC (cmp C)) (setCD (cmp D)))"
>> +/// is transformed to
>> +/// "not (and (not (and (setCC (cmp C)) (setCC (cmp D))))
>> +/// (and (not (setCA (cmp A)) (not (setCB (cmp B))))))"
>> +/// and implemented as:
>> +/// cmp C
>> +/// ccmp D, inv(CD), CC
>> +/// ccmp A, CA, inv(CD)
>> +/// ccmp B, CB, inv(CA)
>> +/// check for CB flags
>> +/// A counterexample is "or (and A B) (and C D)" which cannot be implemented
>> +/// by conditional compare sequences.
>> +/// @{
>> +
>> +/// Create a conditional comparison; Use CCMP, CCMN or FCCMP as
>> apropriate.
>> +static SDValue emitConditionalComparison(SDValue LHS, SDValue RHS,
>> + ISD::CondCode CC, SDValue CCOp,
>> + SDValue Condition, unsigned
> NZCV,
>> + SDLoc DL, SelectionDAG &DAG) {
>> + unsigned Opcode = 0;
>> + if (LHS.getValueType().isFloatingPoint())
>> + Opcode = AArch64ISD::FCCMP;
>> + else if (RHS.getOpcode() == ISD::SUB) {
>> + SDValue SubOp0 = RHS.getOperand(0);
>> + if (const ConstantSDNode *SubOp0C =
>> dyn_cast<ConstantSDNode>(SubOp0))
>> + if (SubOp0C->isNullValue() && (CC == ISD::SETEQ || CC ==
> ISD::SETNE)) {
>> + // See emitComparison() on why we can only do this for SETEQ and
>> SETNE.
>> + Opcode = AArch64ISD::CCMN;
>> + RHS = RHS.getOperand(1);
>> + }
>> + }
>> + if (Opcode == 0)
>> + Opcode = AArch64ISD::CCMP;
>> +
>> + SDValue NZCVOp = DAG.getConstant(NZCV, DL, MVT::i32);
>> + return DAG.getNode(Opcode, DL, MVT_CC, LHS, RHS, NZCVOp, Condition,
>> +CCOp); }
>> +
>> +/// Returns true if @p Val is a tree of AND/OR/SETCC operations.
>> +/// CanPushNegate is set to true if we can push a negate operation through
>> +/// the tree in a was that we are left with AND operations and negate operations
>> +/// at the leafs only. i.e. "not (or (or x y) z)" can be changed to
>> +/// "and (and (not x) (not y)) (not z)"; "not (or (and x y) z)" cannot be
>> +/// brought into such a form.
>> +static bool isConjunctionDisjunctionTree(const SDValue Val, bool &CanPushNegate,
>> + unsigned Depth = 0) {
>> + if (!Val.hasOneUse())
>> + return false;
>> + unsigned Opcode = Val->getOpcode();
>> + if (Opcode == ISD::SETCC) {
>> + CanPushNegate = true;
>> + return true;
>> + }
>> + // Protect against stack overflow.
>> + if (Depth > 15)
>> + return false;
>> + if (Opcode == ISD::AND || Opcode == ISD::OR) {
>> + SDValue O0 = Val->getOperand(0);
>> + SDValue O1 = Val->getOperand(1);
>> + bool CanPushNegateL;
>> + if (!isConjunctionDisjunctionTree(O0, CanPushNegateL, Depth+1))
>> + return false;
>> + bool CanPushNegateR;
>> + if (!isConjunctionDisjunctionTree(O1, CanPushNegateR, Depth+1))
>> + return false;
>> + // We cannot push a negate through an AND operation (it would become
>> an OR),
>> + // we can however change a (not (or x y)) to (and (not x) (not y)) if
> we can
>> + // push the negate through the x/y subtrees.
>> + CanPushNegate = (Opcode == ISD::OR) && CanPushNegateL &&
>> CanPushNegateR;
>> + return true;
>> + }
>> + return false;
>> +}
>> +
>> +/// Emit conjunction or disjunction tree with the CMP/FCMP followed by a chain
>> +/// of CCMP/CFCMP ops. See @ref AArch64CCMP.
>> +/// Tries to transform the given i1 producing node @p Val to a series compare
>> +/// and conditional compare operations. @returns an NZCV flags producing node
>> +/// and sets @p OutCC to the flags that should be tested or returns SDValue() if
>> +/// transformation was not possible.
>> +/// On recursive invocations @p PushNegate may be set to true to have negation
>> +/// effects pushed to the tree leafs; @p Predicate is an NZCV flag predicate
>> +/// for the comparisons in the current subtree; @p Depth limits the search
>> +/// depth to avoid stack overflow.
>> +static SDValue emitConjunctionDisjunctionTree(SelectionDAG &DAG, SDValue Val,
>> + AArch64CC::CondCode &OutCC, bool PushNegate = false,
>> + SDValue CCOp = SDValue(), AArch64CC::CondCode Predicate = AArch64CC::AL,
>> + unsigned Depth = 0) {
>> + // We're at a tree leaf, produce a conditional comparison operation.
>> + unsigned Opcode = Val->getOpcode();
>> + if (Opcode == ISD::SETCC) {
>> + SDValue LHS = Val->getOperand(0);
>> + SDValue RHS = Val->getOperand(1);
>> + ISD::CondCode CC = cast<CondCodeSDNode>(Val->getOperand(2))-
>>> get();
>> + bool isInteger = LHS.getValueType().isInteger();
>> + if (PushNegate)
>> + CC = getSetCCInverse(CC, isInteger);
>> + SDLoc DL(Val);
>> + // Determine OutCC and handle FP special case.
>> + if (isInteger) {
>> + OutCC = changeIntCCToAArch64CC(CC);
>> + } else {
>> + assert(LHS.getValueType().isFloatingPoint());
>> + AArch64CC::CondCode ExtraCC;
>> + changeFPCCToAArch64CC(CC, OutCC, ExtraCC);
>> + // Surpisingly some floating point conditions can't be tested with
> a
>> + // single condition code. Construct an additional comparison in
> this case.
>> + // See comment below on how we deal with OR conditions.
>> + if (ExtraCC != AArch64CC::AL) {
>> + SDValue ExtraCmp;
>> + if (!CCOp.getNode())
>> + ExtraCmp = emitComparison(LHS, RHS, CC, DL, DAG);
>> + else {
>> + SDValue ConditionOp = DAG.getConstant(Predicate, DL, MVT_CC);
>> + // Note that we want the inverse of ExtraCC, so NZCV is not
> inversed.
>> + unsigned NZCV = AArch64CC::getNZCVToSatisfyCondCode(ExtraCC);
>> + ExtraCmp = emitConditionalComparison(LHS, RHS, CC, CCOp,
>> ConditionOp,
>> + NZCV, DL, DAG);
>> + }
>> + CCOp = ExtraCmp;
>> + Predicate = AArch64CC::getInvertedCondCode(ExtraCC);
>> + OutCC = AArch64CC::getInvertedCondCode(OutCC);
>> + }
>> + }
>> +
>> + // Produce a normal comparison if we are first in the chain
>> + if (!CCOp.getNode())
>> + return emitComparison(LHS, RHS, CC, DL, DAG);
>> + // Otherwise produce a ccmp.
>> + SDValue ConditionOp = DAG.getConstant(Predicate, DL, MVT_CC);
>> + AArch64CC::CondCode InvOutCC =
>> AArch64CC::getInvertedCondCode(OutCC);
>> + unsigned NZCV = AArch64CC::getNZCVToSatisfyCondCode(InvOutCC);
>> + return emitConditionalComparison(LHS, RHS, CC, CCOp, ConditionOp, NZCV, DL,
>> + DAG);
>> + } else if (Opcode != ISD::AND && Opcode != ISD::OR)
>> + return SDValue();
>> +
>> + assert((Opcode == ISD::OR || !PushNegate)
>> + && "Can only push negate through OR operation");
>> +
>> + // Check if both sides can be transformed.
>> + SDValue LHS = Val->getOperand(0);
>> + SDValue RHS = Val->getOperand(1);
>> + bool CanPushNegateL;
>> + if (!isConjunctionDisjunctionTree(LHS, CanPushNegateL, Depth+1))
>> + return SDValue();
>> + bool CanPushNegateR;
>> + if (!isConjunctionDisjunctionTree(RHS, CanPushNegateR, Depth+1))
>> + return SDValue();
>> +
>> + // Do we need to negate our operands?
>> + bool NegateOperands = Opcode == ISD::OR;
>> + // We can negate the results of all previous operations by inverting the
>> + // predicate flags giving us a free negation for one side. For the other side
>> + // we need to be able to push the negation to the leafs of the tree.
>> + if (NegateOperands) {
>> + if (!CanPushNegateL && !CanPushNegateR)
>> + return SDValue();
>> + // Order the side where we can push the negate through to LHS.
>> + if (!CanPushNegateL && CanPushNegateR) {
>> + std::swap(LHS, RHS);
>> + CanPushNegateL = true;
>> + }
>> + }
>> +
>> + // Emit RHS. If we want to negate the tree we only need to push a negate
>> + // through if we are already in a PushNegate case, otherwise we can negate
>> + // the "flags to test" afterwards.
>> + AArch64CC::CondCode RHSCC;
>> + SDValue CmpR = emitConjunctionDisjunctionTree(DAG, RHS, RHSCC, PushNegate,
>> + CCOp, Predicate, Depth+1);
>> + if (NegateOperands && !PushNegate)
>> + RHSCC = AArch64CC::getInvertedCondCode(RHSCC);
>> + // Emit LHS. We must push the negate through if we need to negate it.
>> + SDValue CmpL = emitConjunctionDisjunctionTree(DAG, LHS, OutCC, NegateOperands,
>> + CmpR, RHSCC, Depth+1);
>> + // If we transformed an OR to and AND then we have to negate the result
>> + // (or absorb a PushNegate resulting in a double negation).
>> + if (Opcode == ISD::OR && !PushNegate)
>> + OutCC = AArch64CC::getInvertedCondCode(OutCC);
>> + return CmpL;
>> +}
>> +
>> +/// @}
>> +
>> static SDValue getAArch64Cmp(SDValue LHS, SDValue RHS, ISD::CondCode
>> CC,
>> SDValue &AArch64cc, SelectionDAG &DAG, SDLoc
> dl) {
>> - SDValue Cmp;
>> - AArch64CC::CondCode AArch64CC;
>> if (ConstantSDNode *RHSC = dyn_cast<ConstantSDNode>(RHS.getNode()))
>> {
>> EVT VT = RHS.getValueType();
>> uint64_t C = RHSC->getZExtValue();
>> @@ -1229,47 +1445,56 @@ static SDValue getAArch64Cmp(SDValue LHS
>> }
>> }
>> }
>> - // The imm operand of ADDS is an unsigned immediate, in the range 0 to
>> 4095.
>> - // For the i8 operand, the largest immediate is 255, so this can be
> easily
>> - // encoded in the compare instruction. For the i16 operand, however,
> the
>> - // largest immediate cannot be encoded in the compare.
>> - // Therefore, use a sign extending load and cmn to avoid materializing
> the -
>> 1
>> - // constant. For example,
>> - // movz w1, #65535
>> - // ldrh w0, [x0, #0]
>> - // cmp w0, w1
>> - // >
>> - // ldrsh w0, [x0, #0]
>> - // cmn w0, #1
>> - // Fundamental, we're relying on the property that (zext LHS) == (zext
> RHS)
>> - // if and only if (sext LHS) == (sext RHS). The checks are in place to
> ensure
>> - // both the LHS and RHS are truely zero extended and to make sure the
>> - // transformation is profitable.
>> + SDValue Cmp;
>> + AArch64CC::CondCode AArch64CC;
>> if ((CC == ISD::SETEQ || CC == ISD::SETNE) && isa<ConstantSDNode>(RHS))
>> {
>> - if ((cast<ConstantSDNode>(RHS)->getZExtValue() >> 16 == 0) &&
>> - isa<LoadSDNode>(LHS)) {
>> - if (cast<LoadSDNode>(LHS)->getExtensionType() == ISD::ZEXTLOAD &&
>> - cast<LoadSDNode>(LHS)->getMemoryVT() == MVT::i16 &&
>> - LHS.getNode()->hasNUsesOfValue(1, 0)) {
>> - int16_t ValueofRHS = cast<ConstantSDNode>(RHS)->getZExtValue();
>> - if (ValueofRHS < 0 && isLegalArithImmed(-ValueofRHS)) {
>> - SDValue SExt =
>> - DAG.getNode(ISD::SIGN_EXTEND_INREG, dl, LHS.getValueType(),
>> LHS,
>> - DAG.getValueType(MVT::i16));
>> - Cmp = emitComparison(SExt,
>> - DAG.getConstant(ValueofRHS, dl,
>> - RHS.getValueType()),
>> - CC, dl, DAG);
>> - AArch64CC = changeIntCCToAArch64CC(CC);
>> - AArch64cc = DAG.getConstant(AArch64CC, dl, MVT::i32);
>> - return Cmp;
>> - }
>> + const ConstantSDNode *RHSC = cast<ConstantSDNode>(RHS);
>> +
>> + // The imm operand of ADDS is an unsigned immediate, in the range 0
> to
>> 4095.
>> + // For the i8 operand, the largest immediate is 255, so this can be
> easily
>> + // encoded in the compare instruction. For the i16 operand, however,
> the
>> + // largest immediate cannot be encoded in the compare.
>> + // Therefore, use a sign extending load and cmn to avoid
> materializing the
>> + // -1 constant. For example,
>> + // movz w1, #65535
>> + // ldrh w0, [x0, #0]
>> + // cmp w0, w1
>> + // >
>> + // ldrsh w0, [x0, #0]
>> + // cmn w0, #1
>> + // Fundamental, we're relying on the property that (zext LHS) ==
> (zext
>> RHS)
>> + // if and only if (sext LHS) == (sext RHS). The checks are in place
> to
>> + // ensure both the LHS and RHS are truely zero extended and to make
>> sure the
>> + // transformation is profitable.
>> + if ((RHSC->getZExtValue() >> 16 == 0) && isa<LoadSDNode>(LHS) &&
>> + cast<LoadSDNode>(LHS)->getExtensionType() == ISD::ZEXTLOAD &&
>> + cast<LoadSDNode>(LHS)->getMemoryVT() == MVT::i16 &&
>> + LHS.getNode()->hasNUsesOfValue(1, 0)) {
>> + int16_t ValueofRHS = cast<ConstantSDNode>(RHS)->getZExtValue();
>> + if (ValueofRHS < 0 && isLegalArithImmed(-ValueofRHS)) {
>> + SDValue SExt =
>> + DAG.getNode(ISD::SIGN_EXTEND_INREG, dl, LHS.getValueType(),
>> LHS,
>> + DAG.getValueType(MVT::i16));
>> + Cmp = emitComparison(SExt, DAG.getConstant(ValueofRHS, dl,
>> + RHS.getValueType()),
>> + CC, dl, DAG);
>> + AArch64CC = changeIntCCToAArch64CC(CC);
>> + }
>> + }
>> +
>> + if (!Cmp && (RHSC->isNullValue() || RHSC->isOne())) {
>> + if ((Cmp = emitConjunctionDisjunctionTree(DAG, LHS, AArch64CC))) {
>> + if ((CC == ISD::SETNE) ^ RHSC->isNullValue())
>> + AArch64CC = AArch64CC::getInvertedCondCode(AArch64CC);
>> }
>> }
>> }
>> - Cmp = emitComparison(LHS, RHS, CC, dl, DAG);
>> - AArch64CC = changeIntCCToAArch64CC(CC);
>> - AArch64cc = DAG.getConstant(AArch64CC, dl, MVT::i32);
>> +
>> + if (!Cmp) {
>> + Cmp = emitComparison(LHS, RHS, CC, dl, DAG);
>> + AArch64CC = changeIntCCToAArch64CC(CC);
>> + }
>> + AArch64cc = DAG.getConstant(AArch64CC, dl, MVT_CC);
>> return Cmp;
>> }
>>
>> @@ -9294,3 +9519,8 @@ bool AArch64TargetLowering::functionArgu
>> Type *Ty, CallingConv::ID CallConv, bool isVarArg) const {
>> return Ty->isArrayTy();
>> }
>> +
>> +bool
>> AArch64TargetLowering::shouldNormalizeToSelectSequence(LLVMContext
>> &,
>> + EVT) const
>> +{
>> + return false;
>> +}
>>
>> Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h?rev=242436&r1=242435&r2=242436&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h (original)
>> +++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h Thu Jul 16 15:02:37 2015
>> @@ -58,6 +58,11 @@ enum NodeType : unsigned {
>> SBCS,
>> ANDS,
>>
>> + // Conditional compares. Operands: left,right,falsecc,cc,flags
>> + CCMP,
>> + CCMN,
>> + FCCMP,
>> +
>> // Floating point comparison
>> FCMP,
>>
>> @@ -516,6 +521,8 @@ private:
>> bool functionArgumentNeedsConsecutiveRegisters(Type *Ty,
>> CallingConv::ID
> CallConv,
>> bool isVarArg) const
> override;
>> +
>> + bool shouldNormalizeToSelectSequence(LLVMContext &, EVT) const override;
>> };
>>
>> namespace AArch64 {
>>
>> Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td?rev=242436&r1=242435&r2=242436&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td (original)
>> +++ llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td Thu Jul 16 15:02:37 2015
>> @@ -525,6 +525,13 @@ def imm0_31 : Operand<i64>, ImmLeaf<i64,
>> let ParserMatchClass = Imm0_31Operand;
>> }
>>
>> +// True if the 32-bit immediate is in the range [0,31]
>> +def imm32_0_31 : Operand<i32>, ImmLeaf<i32, [{
>> + return ((uint64_t)Imm) < 32;
>> +}]> {
>> + let ParserMatchClass = Imm0_31Operand;
>> +}
>> +
>> // imm0_15 predicate - True if the immediate is in the range [0,15]
>> def imm0_15 : Operand<i64>, ImmLeaf<i64, [{
>> return ((uint64_t)Imm) < 16;
>> @@ -542,7 +549,9 @@ def imm0_7 : Operand<i64>, ImmLeaf<i64,
>> // imm32_0_15 predicate - True if the 32-bit immediate is in the range [0,15]
>> def imm32_0_15 : Operand<i32>, ImmLeaf<i32, [{
>> return ((uint32_t)Imm) < 16;
>> -}]>;
>> +}]> {
>> + let ParserMatchClass = Imm0_15Operand;
>> +}
>>
>> // An arithmetic shifter operand:
>> // {7-6} - shift type: 00 = lsl, 01 = lsr, 10 = asr
>> @@ -2108,9 +2117,12 @@ multiclass LogicalRegS<bits<2> opc, bit
>> //---
>>
>> let mayLoad = 0, mayStore = 0, hasSideEffects = 0 in
>> -class BaseCondSetFlagsImm<bit op, RegisterClass regtype, string asm>
>> - : I<(outs), (ins regtype:$Rn, imm0_31:$imm, imm0_15:$nzcv, ccode:$cond),
>> - asm, "\t$Rn, $imm, $nzcv, $cond", "", []>,
>> +class BaseCondComparisonImm<bit op, RegisterClass regtype, ImmLeaf immtype,
>> + string mnemonic, SDNode OpNode>
>> + : I<(outs), (ins regtype:$Rn, immtype:$imm, imm32_0_15:$nzcv, ccode:$cond),
>> + mnemonic, "\t$Rn, $imm, $nzcv, $cond", "",
>> + [(set NZCV, (OpNode regtype:$Rn, immtype:$imm, (i32 imm:$nzcv),
>> + (i32 imm:$cond), NZCV))]>,
>> Sched<[WriteI, ReadI]> {
>> let Uses = [NZCV];
>> let Defs = [NZCV];
>> @@ -2130,19 +2142,13 @@ class BaseCondSetFlagsImm<bit op, Regist
>> let Inst{3-0} = nzcv;
>> }
>>
>> -multiclass CondSetFlagsImm<bit op, string asm> {
>> - def Wi : BaseCondSetFlagsImm<op, GPR32, asm> {
>> - let Inst{31} = 0;
>> - }
>> - def Xi : BaseCondSetFlagsImm<op, GPR64, asm> {
>> - let Inst{31} = 1;
>> - }
>> -}
>> -
>> let mayLoad = 0, mayStore = 0, hasSideEffects = 0 in
>> -class BaseCondSetFlagsReg<bit op, RegisterClass regtype, string asm>
>> - : I<(outs), (ins regtype:$Rn, regtype:$Rm, imm0_15:$nzcv, ccode:$cond),
>> - asm, "\t$Rn, $Rm, $nzcv, $cond", "", []>,
>> +class BaseCondComparisonReg<bit op, RegisterClass regtype, string mnemonic,
>> + SDNode OpNode>
>> + : I<(outs), (ins regtype:$Rn, regtype:$Rm, imm32_0_15:$nzcv, ccode:$cond),
>> + mnemonic, "\t$Rn, $Rm, $nzcv, $cond", "",
>> + [(set NZCV, (OpNode regtype:$Rn, regtype:$Rm, (i32 imm:$nzcv),
>> + (i32 imm:$cond), NZCV))]>,
>> Sched<[WriteI, ReadI, ReadI]> {
>> let Uses = [NZCV];
>> let Defs = [NZCV];
>> @@ -2162,11 +2168,19 @@ class BaseCondSetFlagsReg<bit op, Regist
>> let Inst{3-0} = nzcv;
>> }
>>
>> -multiclass CondSetFlagsReg<bit op, string asm> {
>> - def Wr : BaseCondSetFlagsReg<op, GPR32, asm> {
>> +multiclass CondComparison<bit op, string mnemonic, SDNode OpNode> {
>> + // immediate operand variants
>> + def Wi : BaseCondComparisonImm<op, GPR32, imm32_0_31, mnemonic, OpNode> {
>> let Inst{31} = 0;
>> }
>> - def Xr : BaseCondSetFlagsReg<op, GPR64, asm> {
>> + def Xi : BaseCondComparisonImm<op, GPR64, imm0_31, mnemonic, OpNode> {
>> + let Inst{31} = 1;
>> + }
>> + // register operand variants
>> + def Wr : BaseCondComparisonReg<op, GPR32, mnemonic, OpNode> {
>> + let Inst{31} = 0;
>> + }
>> + def Xr : BaseCondComparisonReg<op, GPR64, mnemonic, OpNode> {
>> let Inst{31} = 1;
>> }
>> }
>> @@ -3974,11 +3988,14 @@ multiclass FPComparison<bit signalAllNan
>> //---
>>
>> let mayLoad = 0, mayStore = 0, hasSideEffects = 0 in
>> -class BaseFPCondComparison<bit signalAllNans,
>> - RegisterClass regtype, string asm>
>> - : I<(outs), (ins regtype:$Rn, regtype:$Rm, imm0_15:$nzcv, ccode:$cond),
>> - asm, "\t$Rn, $Rm, $nzcv, $cond", "", []>,
>> +class BaseFPCondComparison<bit signalAllNans, RegisterClass regtype,
>> + string mnemonic, list<dag> pat>
>> + : I<(outs), (ins regtype:$Rn, regtype:$Rm, imm32_0_15:$nzcv, ccode:$cond),
>> + mnemonic, "\t$Rn, $Rm, $nzcv, $cond", "", pat>,
>> Sched<[WriteFCmp]> {
>> + let Uses = [NZCV];
>> + let Defs = [NZCV];
>> +
>> bits<5> Rn;
>> bits<5> Rm;
>> bits<4> nzcv;
>> @@ -3994,16 +4011,18 @@ class BaseFPCondComparison<bit signalAll
>> let Inst{3-0} = nzcv;
>> }
>>
>> -multiclass FPCondComparison<bit signalAllNans, string asm> {
>> - let Defs = [NZCV], Uses = [NZCV] in {
>> - def Srr : BaseFPCondComparison<signalAllNans, FPR32, asm> {
>> +multiclass FPCondComparison<bit signalAllNans, string mnemonic,
>> + SDPatternOperator OpNode = null_frag> {
>> + def Srr : BaseFPCondComparison<signalAllNans, FPR32, mnemonic,
>> + [(set NZCV, (OpNode (f32 FPR32:$Rn), (f32 FPR32:$Rm), (i32 imm:$nzcv),
>> + (i32 imm:$cond), NZCV))]> {
>> let Inst{22} = 0;
>> }
>> -
>> - def Drr : BaseFPCondComparison<signalAllNans, FPR64, asm> {
>> + def Drr : BaseFPCondComparison<signalAllNans, FPR64, mnemonic,
>> + [(set NZCV, (OpNode (f64 FPR64:$Rn), (f64 FPR64:$Rm), (i32 imm:$nzcv),
>> + (i32 imm:$cond), NZCV))]> {
>> let Inst{22} = 1;
>> }
>> - } // Defs = [NZCV], Uses = [NZCV]
>> }
>>
>> //---
>>
>> Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td?rev=242436&r1=242435&r2=242436&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td (original)
>> +++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td Thu Jul 16 15:02:37 2015
>> @@ -66,6 +66,20 @@ def SDT_AArch64CSel : SDTypeProfile<1,
>> SDTCisSameAs<0, 2>,
>> SDTCisInt<3>,
>> SDTCisVT<4, i32>]>;
>> +def SDT_AArch64CCMP : SDTypeProfile<1, 5,
>> + [SDTCisVT<0, i32>,
>> + SDTCisInt<1>,
>> + SDTCisSameAs<1, 2>,
>> + SDTCisInt<3>,
>> + SDTCisInt<4>,
>> + SDTCisVT<5, i32>]>;
>> +def SDT_AArch64FCCMP : SDTypeProfile<1, 5,
>> + [SDTCisVT<0, i32>,
>> + SDTCisFP<1>,
>> + SDTCisSameAs<1, 2>,
>> + SDTCisInt<3>,
>> + SDTCisInt<4>,
>> + SDTCisVT<5, i32>]>;
>> def SDT_AArch64FCmp : SDTypeProfile<0, 2,
>> [SDTCisFP<0>,
>> SDTCisSameAs<0, 1>]>;
>> @@ -160,6 +174,10 @@ def AArch64and_flag : SDNode<"AArch64IS
>> def AArch64adc_flag : SDNode<"AArch64ISD::ADCS", SDTBinaryArithWithFlagsInOut>;
>> def AArch64sbc_flag : SDNode<"AArch64ISD::SBCS", SDTBinaryArithWithFlagsInOut>;
>>
>> +def AArch64ccmp : SDNode<"AArch64ISD::CCMP", SDT_AArch64CCMP>;
>> +def AArch64ccmn : SDNode<"AArch64ISD::CCMN", SDT_AArch64CCMP>;
>> +def AArch64fccmp : SDNode<"AArch64ISD::FCCMP", SDT_AArch64FCCMP>;
>> +
>> def AArch64threadpointer : SDNode<"AArch64ISD::THREAD_POINTER", SDTPtrLeaf>;
>>
>> def AArch64fcmp : SDNode<"AArch64ISD::FCMP", SDT_AArch64FCmp>;
>> @@ -1020,13 +1038,10 @@ def : InstAlias<"uxth $dst, $src", (UBFM
>> def : InstAlias<"uxtw $dst, $src", (UBFMXri GPR64:$dst, GPR64:$src, 0, 31)>;
>>
>> //===----------------------------------------------------------------------===//
>> -// Conditionally set flags instructions.
>> +// Conditional comparison instructions.
>> //===----------------------------------------------------------------------===//
>> -defm CCMN : CondSetFlagsImm<0, "ccmn">;
>> -defm CCMP : CondSetFlagsImm<1, "ccmp">;
>> -
>> -defm CCMN : CondSetFlagsReg<0, "ccmn">;
>> -defm CCMP : CondSetFlagsReg<1, "ccmp">;
>> +defm CCMN : CondComparison<0, "ccmn", AArch64ccmn>;
>> +defm CCMP : CondComparison<1, "ccmp", AArch64ccmp>;
>>
>> //===----------------------------------------------------------------------===//
>> // Conditional select instructions.
>> @@ -2556,7 +2571,7 @@ defm FCMP : FPComparison<0, "fcmp", AAr
>> //===----------------------------------------------------------------------===//
>>
>> defm FCCMPE : FPCondComparison<1, "fccmpe">;
>> -defm FCCMP : FPCondComparison<0, "fccmp">;
>> +defm FCCMP : FPCondComparison<0, "fccmp", AArch64fccmp>;
>>
>> //===----------------------------------------------------------------------===//
>> // Floating point conditional select instruction.
>>
>> Modified: llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll?rev=242436&r1=242435&r2=242436&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll (original)
>> +++ llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll Thu Jul 16 15:02:37 2015
>> @@ -287,3 +287,99 @@ sw.bb.i.i:
>> %code1.i.i.phi.trans.insert = getelementptr inbounds %str1, %str1* %0, i64 0, i32 0, i32 0, i64 16
>> br label %sw.bb.i.i
>> }
>> +
>> +; CHECK-LABEL: select_and
>> +define i64 @select_and(i32 %w0, i32 %w1, i64 %x2, i64 %x3) {
>> +; CHECK: cmp w1, #5
>> +; CHECK-NEXT: ccmp w0, w1, #0, ne
>> +; CHECK-NEXT: csel x0, x2, x3, lt
>> +; CHECK-NEXT: ret
>> + %1 = icmp slt i32 %w0, %w1
>> + %2 = icmp ne i32 5, %w1
>> + %3 = and i1 %1, %2
>> + %sel = select i1 %3, i64 %x2, i64 %x3
>> + ret i64 %sel
>> +}
>> +
>> +; CHECK-LABEL: select_or
>> +define i64 @select_or(i32 %w0, i32 %w1, i64 %x2, i64 %x3) {
>> +; CHECK: cmp w1, #5
>> +; CHECK-NEXT: ccmp w0, w1, #8, eq
>> +; CHECK-NEXT: csel x0, x2, x3, lt
>> +; CHECK-NEXT: ret
>> + %1 = icmp slt i32 %w0, %w1
>> + %2 = icmp ne i32 5, %w1
>> + %3 = or i1 %1, %2
>> + %sel = select i1 %3, i64 %x2, i64 %x3
>> + ret i64 %sel
>> +}
>> +
>> +; CHECK-LABEL: select_complicated
>> +define i16 @select_complicated(double %v1, double %v2, i16 %a, i16 %b) {
>> +; CHECK: ldr [[REG:d[0-9]+]],
>> +; CHECK: fcmp d0, d2
>> +; CHECK-NEXT: fmov d2, #13.00000000
>> +; CHECK-NEXT: fccmp d1, d2, #4, ne
>> +; CHECK-NEXT: fccmp d0, d1, #1, ne
>> +; CHECK-NEXT: fccmp d0, d1, #4, vc
>> +; CEHCK-NEXT: csel w0, w0, w1, eq
>> + %1 = fcmp one double %v1, %v2
>> + %2 = fcmp oeq double %v2, 13.0
>> + %3 = fcmp oeq double %v1, 42.0
>> + %or0 = or i1 %2, %3
>> + %or1 = or i1 %1, %or0
>> + %sel = select i1 %or1, i16 %a, i16 %b
>> + ret i16 %sel
>> +}
>> +
>> +; CHECK-LABEL: gccbug
>> +define i64 @gccbug(i64 %x0, i64 %x1) {
>> +; CHECK: cmp x1, #0
>> +; CHECK-NEXT: ccmp x0, #2, #0, eq
>> +; CHECK-NEXT: ccmp x0, #4, #4, ne
>> +; CHECK-NEXT: orr w[[REGNUM:[0-9]+]], wzr, #0x1
>> +; CHECK-NEXT: cinc x0, x[[REGNUM]], eq
>> +; CHECK-NEXT: ret
>> + %cmp0 = icmp eq i64 %x1, 0
>> + %cmp1 = icmp eq i64 %x0, 2
>> + %cmp2 = icmp eq i64 %x0, 4
>> +
>> + %or = or i1 %cmp2, %cmp1
>> + %and = and i1 %or, %cmp0
>> +
>> + %sel = select i1 %and, i64 2, i64 1
>> + ret i64 %sel
>> +}
>> +
>> +; CHECK-LABEL: select_ororand
>> +define i32 @select_ororand(i32 %w0, i32 %w1, i32 %w2, i32 %w3) {
>> +; CHECK: cmp w3, #4
>> +; CHECK-NEXT: ccmp w2, #2, #0, gt
>> +; CHECK-NEXT: ccmp w1, #13, #2, ge
>> +; CHECK-NEXT: ccmp w0, #0, #4, ls
>> +; CHECK-NEXT: csel w0, w3, wzr, eq
>> +; CHECK-NEXT: ret
>> + %c0 = icmp eq i32 %w0, 0
>> + %c1 = icmp ugt i32 %w1, 13
>> + %c2 = icmp slt i32 %w2, 2
>> + %c4 = icmp sgt i32 %w3, 4
>> + %or = or i1 %c0, %c1
>> + %and = and i1 %c2, %c4
>> + %or1 = or i1 %or, %and
>> + %sel = select i1 %or1, i32 %w3, i32 0
>> + ret i32 %sel
>> +}
>> +
>> +; CHECK-LABEL: select_noccmp
>> +define i64 @select_noccmp(i64 %v1, i64 %v2, i64 %v3, i64 %r) {
>> +; CHECK-NOT: CCMP
>> + %c0 = icmp slt i64 %v1, 0
>> + %c1 = icmp sgt i64 %v1, 13
>> + %c2 = icmp slt i64 %v3, 2
>> + %c4 = icmp sgt i64 %v3, 4
>> + %and0 = and i1 %c0, %c1
>> + %and1 = and i1 %c2, %c4
>> + %or = or i1 %and0, %and1
>> + %sel = select i1 %or, i64 0, i64 %r
>> + ret i64 %sel
>> +}
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
> <reg.ll><reg.r242236.s><reg.s>