[llvm] r259387 - AArch64: Implement missed conditional compare sequences.

Quentin Colombet via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 9 09:58:36 PDT 2016


> On Jun 9, 2016, at 9:19 AM, Balaram Makam <bmakam at codeaurora.org> wrote:
> 
> The transformation that helps my case is applicable during LegalizeDAG. It will try to construct a ccmp sequence when it sees And i32/i64, Or i32/i64 with setcc leafs. If the result is feedback into condition input of a node like BRCOND or CSEL it will pessimize the codegen. Since LegalizeDAG works in reverse topological order, the only way my transformation is applicable is when we previously had a zext i1->i32 or if we failed to construct ccmp sequence at SELECT_CC nodes because isConjunctionDisjunctionTree was not satisfied. If I want to move the transform to PerformDAGCombine and still keep using the utilities isConjunctionDisjunctionTree and emitConditionalComparison I will have to do before LegalizeOps phase, but since LowerSELECT_CC is not yet called I will have to additionally check if the result of AND/OR is feedback into BRCOND or CSEL and bail out conservatively. This will miss some opportunities where we would have transformed if it failed to construct a ccmp sequence at SELECT_CC nodes. If we want to catch all the opportunities, we need to do the transform afterLegalizeVectorOps phase, but now the setcc and cmp nodes are all lowered into AArch64 specific DAG nodes and the utilities isConjunctionDisjunctionTree andemitConditionalComparison will fail to match the pattern. So I don’t see an easy way to do this without a lot of refactoring.

Refactoring the utility functions seems the way to go. You may also catch more cases, if those AArch64 specific nodes can be introduced by other transformations.

>  
> -Balaram  
>  
> From: qcolombet at apple.com [mailto:qcolombet at apple.com] 
> Sent: Wednesday, June 08, 2016 6:32 PM
> To: Balaram Makam <bmakam at codeaurora.org>
> Cc: llvm-commits at lists.llvm.org
> Subject: Re: [llvm] r259387 - AArch64: Implement missed conditional compare sequences.
>  
> Hi Balaram,
>  
>> On Jun 8, 2016, at 2:10 PM, Balaram Makam <bmakam at codeaurora.org <mailto:bmakam at codeaurora.org>> wrote:
>>  
>> Hi Quentin,
>>  
>> I don’t see an easy way to achieve the same combine in PerformDAGCombine because it depends on the utilities emitConditionalComparison and isConjunctionDisjunctionTree that will now fail to match the AArch64 specific DAG nodes.
>  
> I am not sure I get why this is a problem. You should be able to use those within PerformDAGCombine, shouldn’t you?
> 
> 
>> Is there any other way we could do the custom lowering with just the isLegal part of the check?
>  
> No, there is not at legalization time.
> When does the combine, that help in your case, apply?
>  
> Cheers,
> -Quentin
> 
>>  
>> Regards,
>> Balaram
>>  
>> From: Quentin Colombet [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>] 
>> Sent: Monday, June 06, 2016 7:34 PM
>> To: Balaram Makam <bmakam at codeaurora.org <mailto:bmakam at codeaurora.org>>
>> Cc: llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>> Subject: Re: [llvm] r259387 - AArch64: Implement missed conditional compare sequences.
>>  
>> Hi Balaram,
>>  
>>> On Jun 6, 2016, at 2:43 PM, Balaram Makam <bmakam at codeaurora.org <mailto:bmakam at codeaurora.org>> wrote:
>>>  
>>> Hi Quentin,
>>>  
>>> I will take another look at this. I believe the problem is seen only on the internal branch. Do you have a test case to reproduce the issue?
>>  
>> Unfortunately, no, I do not have a test case for you.
>> Basically, this patch generates illegal code after legalization, and the problem is only the fact that we use Custom here to check legality.
>>  
>> You should be able to achieve the same combine in the AArch64 backend with just the isLegal part of the check.
>> There are a bunch of hooks during the lowering of SDAG and I am pretty sure you could find another way to do your custom lowering (I know I’ve already been in that situation :)). The "perform dag combine” that I mentioned should be one of them, but you have others.
>>  
>> Cheers,
>> -Quentin
>> 
>> 
>> 
>>>  
>>> Thanks,
>>> Balaram
>>>  
>>> From: Quentin Colombet [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>] 
>>> Sent: Monday, June 06, 2016 5:30 PM
>>> To: Balaram Makam <bmakam at codeaurora.org <mailto:bmakam at codeaurora.org>>
>>> Cc: llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>> Subject: Re: [llvm] r259387 - AArch64: Implement missed conditional compare sequences.
>>>  
>>> Hi Balaram,
>>>  
>>> Sorry for the late reply, but I am seeing this commit is causing problem internally.
>>>  
>>>> On Feb 1, 2016, at 11:13 AM, Balaram Makam via llvm-commits <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
>>>>  
>>>> Author: bmakam
>>>> Date: Mon Feb  1 13:13:07 2016
>>>> New Revision: 259387
>>>> 
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=259387&view=rev <http://llvm.org/viewvc/llvm-project?rev=259387&view=rev>
>>>> Log:
>>>> AArch64: Implement missed conditional compare sequences.
>>>> 
>>>> Summary:
>>>> This is an extension to the existing implementation of r242436 which
>>>> restricts to only select inputs. This version fixes missed opportunities
>>>> in pr26084 by attempting to lower conditional compare sequences of
>>>> and/or trees with setcc leafs. This will additionaly handle the case
>>>> when a tree with select input is not a conjunction-disjunction tree
>>>> but some of the sub trees are conjunction-disjunction trees.
>>>> 
>>>> Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB
>>>> 
>>>> Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry
>>>> 
>>>> Differential Revision: http://reviews.llvm.org/D16291 <http://reviews.llvm.org/D16291>
>>>> 
>>>> Modified:
>>>>    llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>>>>    llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
>>>>    llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h
>>>>    llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll
>>>> 
>>>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=259387&r1=259386&r2=259387&view=diff <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=259387&r1=259386&r2=259387&view=diff>
>>>> ==============================================================================
>>>> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
>>>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Mon Feb  1 13:13:07 2016
>>>> @@ -1721,7 +1721,7 @@ SDValue DAGCombiner::visitADD(SDNode *N)
>>>>     return SDValue(N, 0);
>>>> 
>>>>   // fold (a+b) -> (a|b) iff a and b share no bits.
>>>> -  if ((!LegalOperations || TLI.isOperationLegal(ISD::OR, VT)) &&
>>>> +  if ((!LegalOperations || TLI.isOperationLegalOrCustom(ISD::OR, VT)) &&
>>>  
>>> This is wrong. Custom does not imply that the operation is Legal!
>>>  
>>> You said in the review that without custom, one of the lit test case was failing. I believe you can fix that by making the Custom lowering as part of the combine process (PerformDAGCombine.)
>>>  
>>> Anyhow, this line is generally not correct.
>>>  
>>> Cheers,
>>> -Quentin
>>>  
>>> 
>>> 
>>> 
>>> 
>>>>       VT.isInteger() && !VT.isVector() && DAG.haveNoCommonBitsSet(N0, N1))
>>>>     return DAG.getNode(ISD::OR, SDLoc(N), VT, N0, N1);
>>>> 
>>>> @@ -6363,7 +6363,7 @@ SDValue DAGCombiner::visitZERO_EXTEND(SD
>>>>       isa<LoadSDNode>(N0.getOperand(0)) &&
>>>>       N0.getOperand(1).getOpcode() == ISD::Constant &&
>>>>       TLI.isLoadExtLegal(ISD::ZEXTLOAD, VT, N0.getValueType()) &&
>>>> -      (!LegalOperations && TLI.isOperationLegal(N0.getOpcode(), VT))) {
>>>> +      (!LegalOperations && TLI.isOperationLegalOrCustom(N0.getOpcode(), VT))) {
>>>>     LoadSDNode *LN0 = cast<LoadSDNode>(N0.getOperand(0));
>>>>     if (LN0->getExtensionType() != ISD::SEXTLOAD && LN0->isUnindexed()) {
>>>>       bool DoXform = true;
>>>> 
>>>> Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=259387&r1=259386&r2=259387&view=diff <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=259387&r1=259386&r2=259387&view=diff>
>>>> ==============================================================================
>>>> --- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original)
>>>> +++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Mon Feb  1 13:13:07 2016
>>>> @@ -144,6 +144,16 @@ AArch64TargetLowering::AArch64TargetLowe
>>>>   setOperationAction(ISD::XOR, MVT::i32, Custom);
>>>>   setOperationAction(ISD::XOR, MVT::i64, Custom);
>>>> 
>>>> +  // Custom lowering hooks are needed for OR
>>>> +  // to fold it into CCMP.
>>>> +  setOperationAction(ISD::OR, MVT::i32, Custom);
>>>> +  setOperationAction(ISD::OR, MVT::i64, Custom);
>>>> +
>>>> +  // Custom lowering hooks are needed for AND
>>>> +  // to fold it into CCMP.
>>>> +  setOperationAction(ISD::AND, MVT::i32, Custom);
>>>> +  setOperationAction(ISD::AND, MVT::i64, Custom);
>>>> +
>>>>   // Virtually no operation on f128 is legal, but LLVM can't expand them when
>>>>   // there's a valid register class, so we need custom operations in most cases.
>>>>   setOperationAction(ISD::FABS, MVT::f128, Expand);
>>>> @@ -1597,6 +1607,27 @@ static SDValue getAArch64Cmp(SDValue LHS
>>>>   return Cmp;
>>>> }
>>>> 
>>>> +// Attempt to form conditional compare sequences for and/or trees
>>>> +// with setcc leafs.
>>>> +static SDValue tryLowerToAArch64Cmp(SDValue Op, SelectionDAG &DAG) {
>>>> +  SDValue LHS = Op.getOperand(0);
>>>> +  SDValue RHS = Op.getOperand(1);
>>>> +  if ((LHS.getOpcode() != ISD::SETCC) || (RHS.getOpcode() != ISD::SETCC))
>>>> +    return Op;
>>>> +
>>>> +  bool CanNegate;
>>>> +  if (!isConjunctionDisjunctionTree(Op, CanNegate))
>>>> +    return SDValue();
>>>> +
>>>> +  EVT VT = Op.getValueType();
>>>> +  SDLoc DL(Op);
>>>> +  SDValue TVal = DAG.getConstant(1, DL, VT);
>>>> +  SDValue FVal = DAG.getConstant(0, DL, VT);
>>>> +  SDValue CCVal;
>>>> +  SDValue Cmp = getAArch64Cmp(Op, FVal, ISD::SETEQ, CCVal, DAG, DL);
>>>> +  return DAG.getNode(AArch64ISD::CSEL, DL, VT, FVal, TVal, CCVal, Cmp);
>>>> +}
>>>> +
>>>> static std::pair<SDValue, SDValue>
>>>> getAArch64XALUOOp(AArch64CC::CondCode &CC, SDValue Op, SelectionDAG &DAG) {
>>>>   assert((Op.getValueType() == MVT::i32 || Op.getValueType() == MVT::i64) &&
>>>> @@ -1718,6 +1749,18 @@ SDValue AArch64TargetLowering::LowerF128
>>>>   return makeLibCall(DAG, Call, MVT::f128, Ops, false, SDLoc(Op)).first;
>>>> }
>>>> 
>>>> +SDValue AArch64TargetLowering::LowerAND(SDValue Op, SelectionDAG &DAG) const {
>>>> +  if (Op.getValueType().isVector())
>>>> +    return LowerVectorAND(Op, DAG);
>>>> +  return tryLowerToAArch64Cmp(Op, DAG);
>>>> +}
>>>> +
>>>> +SDValue AArch64TargetLowering::LowerOR(SDValue Op, SelectionDAG &DAG) const {
>>>> +  if (Op.getValueType().isVector())
>>>> +    return LowerVectorOR(Op, DAG);
>>>> +  return tryLowerToAArch64Cmp(Op, DAG);
>>>> +}
>>>> +
>>>> static SDValue LowerXOR(SDValue Op, SelectionDAG &DAG) {
>>>>   SDValue Sel = Op.getOperand(0);
>>>>   SDValue Other = Op.getOperand(1);
>>>> @@ -2372,9 +2415,9 @@ SDValue AArch64TargetLowering::LowerOper
>>>>   case ISD::FCOPYSIGN:
>>>>     return LowerFCOPYSIGN(Op, DAG);
>>>>   case ISD::AND:
>>>> -    return LowerVectorAND(Op, DAG);
>>>> +    return LowerAND(Op, DAG);
>>>>   case ISD::OR:
>>>> -    return LowerVectorOR(Op, DAG);
>>>> +    return LowerOR(Op, DAG);
>>>>   case ISD::XOR:
>>>>     return LowerXOR(Op, DAG);
>>>>   case ISD::PREFETCH:
>>>> 
>>>> Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h?rev=259387&r1=259386&r2=259387&view=diff <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h?rev=259387&r1=259386&r2=259387&view=diff>
>>>> ==============================================================================
>>>> --- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h (original)
>>>> +++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h Mon Feb  1 13:13:07 2016
>>>> @@ -488,6 +488,8 @@ private:
>>>>   SDValue LowerCTPOP(SDValue Op, SelectionDAG &DAG) const;
>>>>   SDValue LowerF128Call(SDValue Op, SelectionDAG &DAG,
>>>>                         RTLIB::Libcall Call) const;
>>>> +  SDValue LowerAND(SDValue Op, SelectionDAG &DAG) const;
>>>> +  SDValue LowerOR(SDValue Op, SelectionDAG &DAG) const;
>>>>   SDValue LowerFCOPYSIGN(SDValue Op, SelectionDAG &DAG) const;
>>>>   SDValue LowerFP_EXTEND(SDValue Op, SelectionDAG &DAG) const;
>>>>   SDValue LowerFP_ROUND(SDValue Op, SelectionDAG &DAG) const;
>>>> 
>>>> Modified: llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll?rev=259387&r1=259386&r2=259387&view=diff <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll?rev=259387&r1=259386&r2=259387&view=diff>
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll Mon Feb  1 13:13:07 2016
>>>> @@ -371,21 +371,76 @@ define i32 @select_andor(i32 %v1, i32 %v
>>>>   ret i32 %sel
>>>> }
>>>> 
>>>> -; CHECK-LABEL: select_noccmp1
>>>> -define i64 @select_noccmp1(i64 %v1, i64 %v2, i64 %v3, i64 %r) {
>>>> -; CHECK: cmp x0, #0
>>>> -; CHECK-NEXT: cset [[REG0:w[0-9]+]], lt
>>>> -; CHECK-NEXT: cmp x0, #13
>>>> -; CHECK-NOT: ccmp
>>>> +; CHECK-LABEL: single_noselect
>>>> +define i32 @single_noselect(i32 %A, i32 %B) #0 {
>>>> +; CHECK: cmp w1, #1
>>>> +; CHECK-NEXT: ccmp w0, #1, #8, ge
>>>> +; CHECK-NEXT: cset w0, lt
>>>> +; CHECK-NEXT: ret
>>>> +  %notlhs = icmp slt i32 %A, 1
>>>> +  %notrhs = icmp slt i32 %B, 1
>>>> +  %lnot = or i1 %notlhs, %notrhs
>>>> +  %conv = zext i1 %lnot to i32
>>>> +  ret i32 %conv
>>>> +}
>>>> +
>>>> +; CHECK-LABEL: single_and_ext
>>>> +define i32 @single_and_ext(i32 %A, i32 %B, i32 %C) #0 {
>>>> +; CHECK: cmp w1, #2
>>>> +; CHECK-NEXT: ccmp w0, #4, #0, lt
>>>> +; CHECK-NEXT: cinc w0, w2, lt
>>>> +; CHECK-NEXT: ret
>>>> +  %cmp = icmp slt i32 %A, 4
>>>> +  %cmp1 = icmp slt i32 %B, 2
>>>> +  %and1 = and i1 %cmp, %cmp1
>>>> +  %conv = zext i1 %and1 to i32
>>>> +  %add = add nsw i32 %conv, %C
>>>> +  ret i32 %add
>>>> +}
>>>> +
>>>> +; CHECK-LABEL: single_noselect_phi
>>>> +define i32 @single_noselect_phi(i32 %A, i32 %B, i32 %C) #0 {
>>>> +; CHECK: cmp w1, #0
>>>> +; CHECK-NEXT: ccmp w0, #0, #4, gt
>>>> ; CHECK-NEXT: cset [[REG1:w[0-9]+]], gt
>>>> -; CHECK-NEXT: cmp x2, #2
>>>> +; CHECK-NEXT: cmp w1, #2
>>>> +; CHECK-NEXT: ccmp w0, #4, #8, ge
>>>> ; CHECK-NEXT: cset [[REG2:w[0-9]+]], lt
>>>> +; CHECK-NEXT: cmp w2, #0
>>>> +; CHECK-NEXT: csel w0, [[REG1]], [[REG2]], eq
>>>> +; CHECK-NEXT: ret
>>>> +entry:
>>>> +  %tobool = icmp eq i32 %C, 0
>>>> +  br i1 %tobool, label %if.else, label %if.then
>>>> +
>>>> +if.then:                                          ; preds = %entry
>>>> +  %cmp = icmp slt i32 %A, 4
>>>> +  %cmp1 = icmp slt i32 %B, 2
>>>> +  %0 = or i1 %cmp, %cmp1
>>>> +  br label %if.end
>>>> +
>>>> +if.else:                                          ; preds = %entry
>>>> +  %cmp2 = icmp sgt i32 %A, 0
>>>> +  %cmp3 = icmp sgt i32 %B, 0
>>>> +  %1 = and i1 %cmp2, %cmp3
>>>> +  br label %if.end
>>>> +
>>>> +if.end:                                           ; preds = %if.else, %if.then
>>>> +  %b.0.in <http://b.0.in/> = phi i1 [ %0, %if.then ], [ %1, %if.else ]
>>>> +  %conv = zext i1 %b.0.in <http://b.0.in/> to i32
>>>> +  ret i32 %conv
>>>> +}
>>>> +
>>>> +; CHECK-LABEL: select_noccmp1
>>>> +define i64 @select_noccmp1(i64 %v1, i64 %v2, i64 %v3, i64 %r) {
>>>> +; CHECK: cmp x0, #13
>>>> +; CHECK-NEXT: ccmp x0, #0, #0, gt
>>>> +; CHECK-NEXT: cset [[REG1:w[0-9]+]], lt
>>>> ; CHECK-NEXT: cmp x2, #4
>>>> -; CHECK-NEXT: cset [[REG3:w[0-9]+]], gt
>>>> -; CHECK-NEXT: and [[REG4:w[0-9]+]], [[REG0]], [[REG1]]
>>>> -; CHECK-NEXT: and [[REG5:w[0-9]+]], [[REG2]], [[REG3]]
>>>> -; CHECK-NEXT: orr [[REG6:w[0-9]+]], [[REG4]], [[REG5]]
>>>> -; CHECK-NEXT: cmp [[REG6]], #0
>>>> +; CHECK-NEXT: ccmp x2, #2, #0, gt
>>>> +; CHECK-NEXT: cset [[REG2:w[0-9]+]], lt
>>>> +; CHECK-NEXT: orr [[REG3:w[0-9]+]], [[REG1]], [[REG2]]
>>>> +; CHECK-NEXT: cmp [[REG3]], #0
>>>> ; CHECK-NEXT: csel x0, xzr, x3, ne
>>>> ; CHECK-NEXT: ret
>>>>   %c0 = icmp slt i64 %v1, 0
>>>> 
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160609/d5f5ecc0/attachment.html>


More information about the llvm-commits mailing list