[LLVMdev] Illegal optimization in LLVM 2.8 during SelectionDAG? (Re: comparison pattern trouble - might be a bug in LLVM 2.8?)
Evan Cheng
evan.cheng at apple.com
Sat Oct 2 21:28:57 PDT 2010
On Oct 2, 2010, at 5:25 AM, Duncan Sands wrote:
> Hi,
>
>>> DAGCombiner::visitBRCOND() has code:
>>>
>>> SDValue N1 = N->getOperand(1);
>>> SDValue N2 = N->getOperand(2);
>>>
>>> ...
>>>
>>> SDNode *Trunc = 0;
>>> if (N1.getOpcode() == ISD::TRUNCATE&& N1.hasOneUse()) {
>>> // Look past truncate.
>>> Trunc = N1.getNode();
>>> N1 = N1.getOperand(0);
>>> }
>>>
>>> which just drops the truncate away.
>
> this looks wrong. According to the documentation of BRCOND,
>
> // BRCOND - Conditional branch. The first operand is the chain, the
> // second is the condition, the third is the block to branch to if the
> // condition is true. If the type of the condition is not i1, then the
>
> which means that on a machine with unsigned boolean values, bit 0 should be
> zero or 1 and the rest 0 (on a machine with signed boolean values, all bits
> should be 0, or all bits should be one). So just dropping the truncate seems
> bogus, since it results in a value that may not satisfy these conditions.
I'll try to find some time to look at it in the next few days. But if you have an idea on how to fix it properly, don't be shy about it. :-)
Evan
>
> Ciao,
>
> Duncan.
>
>>> then there is another optimization afterwards..
>>>
>>> // Transform br(xor(x, y)) -> br(x != y)
>>> // Transform br(xor(xor(x,y), 1)) -> br (x == y)
>>> if (N1.hasOneUse()&& N1.getOpcode() == ISD::XOR) {
>>> SDNode *TheXor = N1.getNode();
>>> SDValue Op0 = TheXor->getOperand(0);
>>> SDValue Op1 = TheXor->getOperand(1);
>>> if (Op0.getOpcode() == Op1.getOpcode()) {
>>> // Avoid missing important xor optimizations.
>>> SDValue Tmp = visitXOR(TheXor);
>>> if (Tmp.getNode()&& Tmp.getNode() != TheXor) {
>>> DEBUG(dbgs()<< "\nReplacing.8 ";
>>> TheXor->dump(&DAG);
>>> dbgs()<< "\nWith: ";
>>> Tmp.getNode()->dump(&DAG);
>>> dbgs()<< '\n');
>>> WorkListRemover DeadNodes(*this);
>>> DAG.ReplaceAllUsesOfValueWith(N1, Tmp,&DeadNodes);
>>> removeFromWorkList(TheXor);
>>> DAG.DeleteNode(TheXor);
>>> return DAG.getNode(ISD::BRCOND, N->getDebugLoc(),
>>> MVT::Other, Chain, Tmp, N2);
>>> }
>>> }
>>>
>>> if (Op0.getOpcode() != ISD::SETCC&& Op1.getOpcode() != ISD::SETCC) {
>>> bool Equal = false;
>>> if (ConstantSDNode *RHSCI = dyn_cast<ConstantSDNode>(Op0))
>>> if (RHSCI->getAPIntValue() == 1&& Op0.hasOneUse()&&
>>> Op0.getOpcode() == ISD::XOR) {
>>> TheXor = Op0.getNode();
>>> Equal = true;
>>> }
>>>
>>> SDValue NodeToReplace = Trunc ? SDValue(Trunc, 0) : N1;
>>>
>>> EVT SetCCVT = NodeToReplace.getValueType();
>>> if (LegalTypes)
>>> SetCCVT = TLI.getSetCCResultType(SetCCVT);
>>> SDValue SetCC = DAG.getSetCC(TheXor->getDebugLoc(),
>>> SetCCVT,
>>> Op0, Op1,
>>> Equal ? ISD::SETEQ : ISD::SETNE);
>>> // Replace the uses of XOR with SETCC
>>> WorkListRemover DeadNodes(*this);
>>> DAG.ReplaceAllUsesOfValueWith(NodeToReplace, SetCC,&DeadNodes);
>>> removeFromWorkList(NodeToReplace.getNode());
>>> DAG.DeleteNode(NodeToReplace.getNode());
>>> return DAG.getNode(ISD::BRCOND, N->getDebugLoc(),
>>> MVT::Other, Chain, SetCC, N2);
>>> }
>>> }
>>> }
>>>
>>>
>>> Which then optimizes the xor into setcc which looks ALL bits, not just
>>> the lowest bit.
>>>
>>>
>>>
>>>
>>> Optimized legalized selection DAG:
>>> SelectionDAG has 21 nodes:
>>> 0x2536968: ch = EntryToken [ORD=1] [ID=0]
>>>
>>> 0x2384a70: i32 = undef [ORD=1] [ID=2]
>>>
>>> 0x2536968:<multiple use>
>>> 0x2384b70: i32 = FrameIndex<-1> [ORD=1] [ID=1]
>>>
>>> 0x2384a70:<multiple use>
>>> 0x2384570: i32,ch = load 0x2536968, 0x2384b70,
>>> 0x2384a70<LD4[FixedStack-1]> [ORD=1] [ID=10]
>>>
>>> 0x2384470: ch = ValueType:i8 [ORD=1] [ID=4]
>>>
>>> 0x2384970: i32 = AssertZext 0x2384570, 0x2384470 [ORD=1] [ID=12]
>>>
>>> 0x2536968:<multiple use>
>>> 0x2384670: i32 = FrameIndex<-2> [ORD=2] [ID=3]
>>>
>>> 0x2384a70:<multiple use>
>>> 0x2384c70: i32,ch = load 0x2536968, 0x2384670,
>>> 0x2384a70<LD4[FixedStack-2]> [ORD=2] [ID=11]
>>>
>>> 0x2384370: ch = ValueType:i16 [ORD=2] [ID=5]
>>>
>>> 0x2384270: i32 = AssertZext 0x2384c70, 0x2384370 [ORD=2] [ID=13]
>>>
>>> 0x2536968:<multiple use>
>>> 0x2384d70: i32 = Register %reg16384 [ID=6]
>>>
>>> 0x2384270:<multiple use>
>>> 0x2384e70: ch = CopyToReg 0x2536968, 0x2384d70, 0x2384270 [ID=17]
>>>
>>> 0x2536968:<multiple use>
>>> 0x2385070: i32 = Register %reg16385 [ID=7]
>>>
>>> 0x2384970:<multiple use>
>>> 0x24b33f0: ch = CopyToReg 0x2536968, 0x2385070, 0x2384970 [ID=14]
>>>
>>> 0x2536968:<multiple use>
>>> 0x24b35f0: i32 = Register %reg16386 [ID=8]
>>>
>>> 0x2384270:<multiple use>
>>> 0x24b36f0: ch = CopyToReg 0x2536968, 0x24b35f0, 0x2384270 [ID=16]
>>>
>>> 0x24b40f0: ch = TokenFactor 0x2384e70, 0x24b33f0, 0x24b36f0 [ID=19]
>>>
>>> 0x2384270:<multiple use>
>>> 0x2384970:<multiple use>
>>> 0x24b38f0: ch = setne
>>>
>>> 0x24b39f0: i1 = setcc 0x2384270, 0x2384970, 0x24b38f0
>>>
>>> 0x24b3ff0: ch = BasicBlock< 0x25c5ff8> [ID=9]
>>>
>>> 0x24b41f0: ch = brcond 0x24b40f0, 0x24b39f0, 0x24b3ff0 [ID=20]
>>>
>>>
>>>
>>>
>>> Here we have lost the information that we are only comparing the lowest
>>> bits, as the trunc is gone, and the setcc is done with 32-bit
>>> comparison, comparing all bits.
>>>
>>>
>>>
>>> The way our backend is dynamically generated and loaded from a dynamic
>>> library into custom executable which also links to llvm makes it a bit
>>> hard to send a "easy to test" backend package for this situation.
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
More information about the llvm-dev
mailing list