[LLVMdev] Illegal optimization in LLVM 2.8 during SelectionDAG? (Re: comparison pattern trouble - might be a bug in LLVM 2.8?)
Duncan Sands
baldrick at free.fr
Sat Oct 2 05:25:12 PDT 2010
Hi,
>> DAGCombiner::visitBRCOND() has code:
>>
>> SDValue N1 = N->getOperand(1);
>> SDValue N2 = N->getOperand(2);
>>
>> ...
>>
>> SDNode *Trunc = 0;
>> if (N1.getOpcode() == ISD::TRUNCATE&& N1.hasOneUse()) {
>> // Look past truncate.
>> Trunc = N1.getNode();
>> N1 = N1.getOperand(0);
>> }
>>
>> which just drops the truncate away.
this looks wrong. According to the documentation of BRCOND,
// BRCOND - Conditional branch. The first operand is the chain, the
// second is the condition, the third is the block to branch to if the
// condition is true. If the type of the condition is not i1, then the
which means that on a machine with unsigned boolean values, bit 0 should be
zero or 1 and the rest 0 (on a machine with signed boolean values, all bits
should be 0, or all bits should be one). So just dropping the truncate seems
bogus, since it results in a value that may not satisfy these conditions.
Ciao,
Duncan.
>> then there is another optimization afterwards..
>>
>> // Transform br(xor(x, y)) -> br(x != y)
>> // Transform br(xor(xor(x,y), 1)) -> br (x == y)
>> if (N1.hasOneUse()&& N1.getOpcode() == ISD::XOR) {
>> SDNode *TheXor = N1.getNode();
>> SDValue Op0 = TheXor->getOperand(0);
>> SDValue Op1 = TheXor->getOperand(1);
>> if (Op0.getOpcode() == Op1.getOpcode()) {
>> // Avoid missing important xor optimizations.
>> SDValue Tmp = visitXOR(TheXor);
>> if (Tmp.getNode()&& Tmp.getNode() != TheXor) {
>> DEBUG(dbgs()<< "\nReplacing.8 ";
>> TheXor->dump(&DAG);
>> dbgs()<< "\nWith: ";
>> Tmp.getNode()->dump(&DAG);
>> dbgs()<< '\n');
>> WorkListRemover DeadNodes(*this);
>> DAG.ReplaceAllUsesOfValueWith(N1, Tmp,&DeadNodes);
>> removeFromWorkList(TheXor);
>> DAG.DeleteNode(TheXor);
>> return DAG.getNode(ISD::BRCOND, N->getDebugLoc(),
>> MVT::Other, Chain, Tmp, N2);
>> }
>> }
>>
>> if (Op0.getOpcode() != ISD::SETCC&& Op1.getOpcode() != ISD::SETCC) {
>> bool Equal = false;
>> if (ConstantSDNode *RHSCI = dyn_cast<ConstantSDNode>(Op0))
>> if (RHSCI->getAPIntValue() == 1&& Op0.hasOneUse()&&
>> Op0.getOpcode() == ISD::XOR) {
>> TheXor = Op0.getNode();
>> Equal = true;
>> }
>>
>> SDValue NodeToReplace = Trunc ? SDValue(Trunc, 0) : N1;
>>
>> EVT SetCCVT = NodeToReplace.getValueType();
>> if (LegalTypes)
>> SetCCVT = TLI.getSetCCResultType(SetCCVT);
>> SDValue SetCC = DAG.getSetCC(TheXor->getDebugLoc(),
>> SetCCVT,
>> Op0, Op1,
>> Equal ? ISD::SETEQ : ISD::SETNE);
>> // Replace the uses of XOR with SETCC
>> WorkListRemover DeadNodes(*this);
>> DAG.ReplaceAllUsesOfValueWith(NodeToReplace, SetCC,&DeadNodes);
>> removeFromWorkList(NodeToReplace.getNode());
>> DAG.DeleteNode(NodeToReplace.getNode());
>> return DAG.getNode(ISD::BRCOND, N->getDebugLoc(),
>> MVT::Other, Chain, SetCC, N2);
>> }
>> }
>> }
>>
>>
>> Which then optimizes the xor into setcc which looks ALL bits, not just
>> the lowest bit.
>>
>>
>>
>>
>> Optimized legalized selection DAG:
>> SelectionDAG has 21 nodes:
>> 0x2536968: ch = EntryToken [ORD=1] [ID=0]
>>
>> 0x2384a70: i32 = undef [ORD=1] [ID=2]
>>
>> 0x2536968:<multiple use>
>> 0x2384b70: i32 = FrameIndex<-1> [ORD=1] [ID=1]
>>
>> 0x2384a70:<multiple use>
>> 0x2384570: i32,ch = load 0x2536968, 0x2384b70,
>> 0x2384a70<LD4[FixedStack-1]> [ORD=1] [ID=10]
>>
>> 0x2384470: ch = ValueType:i8 [ORD=1] [ID=4]
>>
>> 0x2384970: i32 = AssertZext 0x2384570, 0x2384470 [ORD=1] [ID=12]
>>
>> 0x2536968:<multiple use>
>> 0x2384670: i32 = FrameIndex<-2> [ORD=2] [ID=3]
>>
>> 0x2384a70:<multiple use>
>> 0x2384c70: i32,ch = load 0x2536968, 0x2384670,
>> 0x2384a70<LD4[FixedStack-2]> [ORD=2] [ID=11]
>>
>> 0x2384370: ch = ValueType:i16 [ORD=2] [ID=5]
>>
>> 0x2384270: i32 = AssertZext 0x2384c70, 0x2384370 [ORD=2] [ID=13]
>>
>> 0x2536968:<multiple use>
>> 0x2384d70: i32 = Register %reg16384 [ID=6]
>>
>> 0x2384270:<multiple use>
>> 0x2384e70: ch = CopyToReg 0x2536968, 0x2384d70, 0x2384270 [ID=17]
>>
>> 0x2536968:<multiple use>
>> 0x2385070: i32 = Register %reg16385 [ID=7]
>>
>> 0x2384970:<multiple use>
>> 0x24b33f0: ch = CopyToReg 0x2536968, 0x2385070, 0x2384970 [ID=14]
>>
>> 0x2536968:<multiple use>
>> 0x24b35f0: i32 = Register %reg16386 [ID=8]
>>
>> 0x2384270:<multiple use>
>> 0x24b36f0: ch = CopyToReg 0x2536968, 0x24b35f0, 0x2384270 [ID=16]
>>
>> 0x24b40f0: ch = TokenFactor 0x2384e70, 0x24b33f0, 0x24b36f0 [ID=19]
>>
>> 0x2384270:<multiple use>
>> 0x2384970:<multiple use>
>> 0x24b38f0: ch = setne
>>
>> 0x24b39f0: i1 = setcc 0x2384270, 0x2384970, 0x24b38f0
>>
>> 0x24b3ff0: ch = BasicBlock< 0x25c5ff8> [ID=9]
>>
>> 0x24b41f0: ch = brcond 0x24b40f0, 0x24b39f0, 0x24b3ff0 [ID=20]
>>
>>
>>
>>
>> Here we have lost the information that we are only comparing the lowest
>> bits, as the trunc is gone, and the setcc is done with 32-bit
>> comparison, comparing all bits.
>>
>>
>>
>> The way our backend is dynamically generated and loaded from a dynamic
>> library into custom executable which also links to llvm makes it a bit
>> hard to send a "easy to test" backend package for this situation.
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
More information about the llvm-dev
mailing list