[llvm] r315272 - Renable r314928
Xinliang David Li via llvm-commits
llvm-commits at lists.llvm.org
Thu May 17 10:46:46 PDT 2018
Sounds good to me. I will drop my two line patch then.
On Thu, May 17, 2018 at 10:33 AM Michael Zolotukhin <mzolotukhin at apple.com>
wrote:
>
>
> On May 17, 2018, at 10:22 AM, Xinliang David Li <davidxl at google.com>
> wrote:
>
> I agree in principle. However, there are also advantages to keeping this
> in instcombine, so creating a separate pass to handle the extremely rare case
> seems like overkill to me, though I do not object to anyone reworking the
> optimization in that direction.
>
> I agree a separate pass would be overkill, but I was thinking more along
> the lines of AggressiveInstCombine, which seems to fit perfectly for this case.
> WDYT?
>
> For the time being, I have a patch ready for review.
>
> Thanks!
>
> Michael
>
>
> thanks,
>
> David
>
> On Thu, May 17, 2018 at 9:08 AM Michael Zolotukhin <mzolotukhin at apple.com>
> wrote:
>
>> A limit would definitely help here, but don’t we have a more fundamental
>> problem? I think the root of all evil is that, in a routine that is local
>> to a single instruction, we’re iterating over all instructions in its parent
>> block. Would it make more sense to convert it from an instruction visitor
>> into a separate function pass? This way we can probably save some
>> unnecessary traversals.
>>
>> Michael
>>
>> On May 16, 2018, at 10:06 PM, Xinliang David Li <davidxl at google.com>
>> wrote:
>>
>> Some kind of limit is probably needed. I can take a look at it if you do
>> not beat me to it.
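
One way to picture such a limit, as a minimal Python model rather than the InstCombine code itself: the search for a matching phi gives up after examining a fixed number of candidates. The names `find_matching_phi` and `MAX_PHIS_TO_SCAN`, and the value 512, are assumptions made for illustration; they are not taken from the patch mentioned in this thread.

```python
# Hypothetical model: each phi is represented as a tuple of incoming values.
MAX_PHIS_TO_SCAN = 512  # assumed cap, chosen only for this sketch


def find_matching_phi(block_phis, wanted, limit=MAX_PHIS_TO_SCAN):
    """Return the index of the first phi equal to `wanted`, or None.

    Only the first `limit` phis are examined, so a single query costs
    O(limit) instead of O(len(block_phis)).
    """
    for index, phi in enumerate(block_phis):
        if index >= limit:
            return None  # give up: the fold is simply not attempted
        if phi == wanted:
            return index
    return None


phis = [("v%d" % i, "w%d" % i) for i in range(10000)]
print(find_matching_phi(phis, ("v3", "w3")))        # match inside the cap
print(find_matching_phi(phis, ("v9000", "w9000")))  # beyond the cap: skipped
```

The trade-off in this model is that a match sitting past the cap is missed, which costs an optimization opportunity but bounds the work per phi.
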
>>
>> David
>>
>>
>> On Wed, May 16, 2018 at 6:57 PM Mikhail Zolotukhin <mzolotukhin at apple.com>
>> wrote:
>>
>>> Hi David,
>>>
>>> We found a huge compile time regression in an internal clang build
>>> caused by this change. I looked at the change and it seems that we’re
>>> scanning all phi-nodes in the basic block for every phi, which makes the
>>> algorithm quadratic. Of course, it’s not always the case, but as our build
>>> showed, it’s quite possible. How can we address it?
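
The quadratic behaviour described above can be illustrated with a small Python model (an illustration only, not the InstCombine code): if handling each phi walks every phi in the same block, a block with n phis costs n * n steps. The function name `count_scan_steps` is made up for this sketch.

```python
def count_scan_steps(num_phis):
    """Count inner-loop steps when every phi scans all phis in its block."""
    steps = 0
    for _ in range(num_phis):      # one visit per phi
        for _ in range(num_phis):  # each visit walks the whole phi list
            steps += 1
    return steps


for n in (100, 200, 400):
    print(n, count_scan_steps(n))
```

In this model, doubling the number of phis quadruples the step count.
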
>>>
>>> Here is a simple script to expose the issue:
>>>
>>>
>>> Usage:
>>> > for i in `seq 4000 2000 20000`; do echo $i; python genphis.py $i |
>>> time opt -instcombine -o /dev/null; done
>>>
>>> 4000
>>> 0.20 real 0.16 user 0.00 sys
>>> 6000
>>> 0.37 real 0.32 user 0.00 sys
>>> 8000
>>> 0.98 real 0.92 user 0.01 sys
>>> 10000
>>> 2.65 real 2.58 user 0.01 sys
>>> 12000
>>> 3.79 real 3.71 user 0.02 sys
>>> 14000
>>> 5.08 real 4.99 user 0.02 sys
>>> 16000
>>> 6.80 real 6.71 user 0.03 sys
>>> 18000
>>> 8.41 real 8.31 user 0.03 sys
>>> 20000
>>> 10.43 real 10.31 user 0.04 sys
>>>
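
As a quick check on the numbers above, the following Python sketch computes how much the "real" time grows each time the input size doubles. A linear algorithm would give ratios near 2 and a quadratic one ratios near 4. The helper name `doubling_ratios` is an assumption of this sketch; the timings are copied from the run above.

```python
# "real" seconds copied from the measurements above, keyed by the script argument.
real_seconds = {
    4000: 0.20, 6000: 0.37, 8000: 0.98, 10000: 2.65, 12000: 3.79,
    14000: 5.08, 16000: 6.80, 18000: 8.41, 20000: 10.43,
}


def doubling_ratios(timings):
    """Map n -> time(2n) / time(n) for every n whose double was also measured."""
    return {n: timings[2 * n] / timings[n] for n in timings if 2 * n in timings}


for n, ratio in sorted(doubling_ratios(real_seconds).items()):
    print("%5d -> %5d: x%.2f" % (n, 2 * n, ratio))
```

Every doubling measured here multiplies the time by roughly 3.9 or more, which is consistent with growth that is at least quadratic.
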
>>>
>>> Thanks,
>>> Michael
>>>
>>> On Oct 9, 2017, at 10:07 PM, Xinliang David Li via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>>
>>> Author: davidxl
>>> Date: Mon Oct 9 22:07:54 2017
>>> New Revision: 315272
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=315272&view=rev
>>> Log:
>>> Renable r314928
>>>
>>>
>>> Eliminate inttype phi with inttoptr/ptrtoint.
>>>
>>> This version fixed a bug in finding the matching
>>> phi -- the order of the incoming blocks may be
>>> different (triggered in self build on Windows).
>>> A new test case is added.
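
The fix described here, comparing incoming values per block rather than per operand position, can be sketched in Python. This is a simplified model, not the C++ from the patch: a phi is a list of (value, block) pairs, `matches_by_position` and `matches_by_block` are hypothetical helper names, and the positional variant is only an assumed shape of the earlier bug. `available` stands for the pointer values corresponding to the integer phi's operands (the role `AvailablePtrVals` plays in the patch).

```python
def matches_by_position(available, ptr_phi):
    """Assumed buggy variant: both phis must list their blocks in the same order."""
    return all(available[i] == ptr_phi[i][0] for i in range(len(ptr_phi)))


def matches_by_block(int_phi_blocks, available, ptr_phi):
    """Order-independent variant: look up the pointer phi's value per block."""
    value_for_block = {block: value for value, block in ptr_phi}
    return all(value_for_block.get(block) == available[i]
               for i, block in enumerate(int_phi_blocks))


# The integer phi's operands arrive from blocks A then B; their pointer equivalents:
int_phi_blocks = ["A", "B"]
available = ["addb", "add"]
# The same pointer phi, but written with its operands in the opposite order.
ptr_phi = [("add", "B"), ("addb", "A")]

print(matches_by_position(available, ptr_phi))               # False
print(matches_by_block(int_phi_blocks, available, ptr_phi))  # True
```

The per-block lookup mirrors the `getIncomingValueForBlock(PN.getIncomingBlock(i))` call in the diff below.
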
>>>
>>> Added:
>>> llvm/trunk/test/Transforms/InstCombine/intptr1.ll
>>> - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr1.ll
>>> llvm/trunk/test/Transforms/InstCombine/intptr2.ll
>>> - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr2.ll
>>> llvm/trunk/test/Transforms/InstCombine/intptr3.ll
>>> - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr3.ll
>>> llvm/trunk/test/Transforms/InstCombine/intptr4.ll
>>> - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr4.ll
>>> llvm/trunk/test/Transforms/InstCombine/intptr5.ll
>>> - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr5.ll
>>> llvm/trunk/test/Transforms/InstCombine/intptr6.ll
>>> - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr6.ll
>>> llvm/trunk/test/Transforms/InstCombine/intptr7.ll
>>> Modified:
>>> llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>>> llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp
>>>
>>> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h?rev=315272&r1=315271&r2=315272&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>>> (original)
>>> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h Mon Oct
>>> 9 22:07:54 2017
>>> @@ -670,6 +670,10 @@ private:
>>> Instruction *FoldPHIArgGEPIntoPHI(PHINode &PN);
>>> Instruction *FoldPHIArgLoadIntoPHI(PHINode &PN);
>>> Instruction *FoldPHIArgZextsIntoPHI(PHINode &PN);
>>> + /// If an integer typed PHI has only one use which is an IntToPtr
>>> operation,
>>> + /// replace the PHI with an existing pointer typed PHI if it exists.
>>> Otherwise
>>> + /// insert a new pointer typed PHI and replace the original one.
>>> + Instruction *FoldIntegerTypedPHI(PHINode &PN);
>>>
>>> /// Helper function for FoldPHIArgXIntoPHI() to set debug location for
>>> the
>>> /// folded operation.
>>>
>>> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp?rev=315272&r1=315271&r2=315272&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp (original)
>>> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp Mon Oct 9
>>> 22:07:54 2017
>>> @@ -40,6 +40,238 @@ void InstCombiner::PHIArgMergedDebugLoc(
>>> }
>>> }
>>>
>>> +// Replace Integer typed PHI PN if the PHI's value is used as a pointer
>>> value.
>>> +// If there is an existing pointer typed PHI that produces the same
>>> value as PN,
>>> +// replace PN and the IntToPtr operation with it. Otherwise, synthesize
>>> a new
>>> +// PHI node:
>>> +//
>>> +// Case-1:
>>> +// bb1:
>>> +// int_init = PtrToInt(ptr_init)
>>> +// br label %bb2
>>> +// bb2:
>>> +// int_val = PHI([int_init, %bb1], [int_val_inc, %bb2]
>>> +// ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>>> +// ptr_val2 = IntToPtr(int_val)
>>> +// ...
>>> +// use(ptr_val2)
>>> +// ptr_val_inc = ...
>>> +// inc_val_inc = PtrToInt(ptr_val_inc)
>>> +//
>>> +// ==>
>>> +// bb1:
>>> +// br label %bb2
>>> +// bb2:
>>> +// ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>>> +// ...
>>> +// use(ptr_val)
>>> +// ptr_val_inc = ...
>>> +//
>>> +// Case-2:
>>> +// bb1:
>>> +// int_ptr = BitCast(ptr_ptr)
>>> +// int_init = Load(int_ptr)
>>> +// br label %bb2
>>> +// bb2:
>>> +// int_val = PHI([int_init, %bb1], [int_val_inc, %bb2]
>>> +// ptr_val2 = IntToPtr(int_val)
>>> +// ...
>>> +// use(ptr_val2)
>>> +// ptr_val_inc = ...
>>> +// inc_val_inc = PtrToInt(ptr_val_inc)
>>> +// ==>
>>> +// bb1:
>>> +// ptr_init = Load(ptr_ptr)
>>> +// br label %bb2
>>> +// bb2:
>>> +// ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>>> +// ...
>>> +// use(ptr_val)
>>> +// ptr_val_inc = ...
>>> +// ...
>>> +//
>>> +Instruction *InstCombiner::FoldIntegerTypedPHI(PHINode &PN) {
>>> + if (!PN.getType()->isIntegerTy())
>>> + return nullptr;
>>> + if (!PN.hasOneUse())
>>> + return nullptr;
>>> +
>>> + auto *IntToPtr = dyn_cast<IntToPtrInst>(PN.user_back());
>>> + if (!IntToPtr)
>>> + return nullptr;
>>> +
>>> + // Check if the pointer is actually used as pointer:
>>> + auto HasPointerUse = [](Instruction *IIP) {
>>> + for (User *U : IIP->users()) {
>>> + Value *Ptr = nullptr;
>>> + if (LoadInst *LoadI = dyn_cast<LoadInst>(U)) {
>>> + Ptr = LoadI->getPointerOperand();
>>> + } else if (StoreInst *SI = dyn_cast<StoreInst>(U)) {
>>> + Ptr = SI->getPointerOperand();
>>> + } else if (GetElementPtrInst *GI =
>>> dyn_cast<GetElementPtrInst>(U)) {
>>> + Ptr = GI->getPointerOperand();
>>> + }
>>> +
>>> + if (Ptr && Ptr == IIP)
>>> + return true;
>>> + }
>>> + return false;
>>> + };
>>> +
>>> + if (!HasPointerUse(IntToPtr))
>>> + return nullptr;
>>> +
>>> + if (DL.getPointerSizeInBits(IntToPtr->getAddressSpace()) !=
>>> + DL.getTypeSizeInBits(IntToPtr->getOperand(0)->getType()))
>>> + return nullptr;
>>> +
>>> + SmallVector<Value *, 4> AvailablePtrVals;
>>> + for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
>>> + Value *Arg = PN.getIncomingValue(i);
>>> +
>>> + // First look backward:
>>> + if (auto *PI = dyn_cast<PtrToIntInst>(Arg)) {
>>> + AvailablePtrVals.emplace_back(PI->getOperand(0));
>>> + continue;
>>> + }
>>> +
>>> + // Next look forward:
>>> + Value *ArgIntToPtr = nullptr;
>>> + for (User *U : Arg->users()) {
>>> + if (isa<IntToPtrInst>(U) && U->getType() == IntToPtr->getType() &&
>>> + (DT.dominates(cast<Instruction>(U), PN.getIncomingBlock(i)) ||
>>> + cast<Instruction>(U)->getParent() ==
>>> PN.getIncomingBlock(i))) {
>>> + ArgIntToPtr = U;
>>> + break;
>>> + }
>>> + }
>>> +
>>> + if (ArgIntToPtr) {
>>> + AvailablePtrVals.emplace_back(ArgIntToPtr);
>>> + continue;
>>> + }
>>> +
>>> + // If Arg is defined by a PHI, allow it. This will also create
>>> + // more opportunities iteratively.
>>> + if (isa<PHINode>(Arg)) {
>>> + AvailablePtrVals.emplace_back(Arg);
>>> + continue;
>>> + }
>>> +
>>> + // For a single use integer load:
>>> + auto *LoadI = dyn_cast<LoadInst>(Arg);
>>> + if (!LoadI)
>>> + return nullptr;
>>> +
>>> + if (!LoadI->hasOneUse())
>>> + return nullptr;
>>> +
>>> + // Push the integer typed Load instruction into the available
>>> + // value set, and fix it up later when the pointer typed PHI
>>> + // is synthesized.
>>> + AvailablePtrVals.emplace_back(LoadI);
>>> + }
>>> +
>>> + // Now search for a matching PHI
>>> + auto *BB = PN.getParent();
>>> + assert(AvailablePtrVals.size() == PN.getNumIncomingValues() &&
>>> + "Not enough available ptr typed incoming values");
>>> + PHINode *MatchingPtrPHI = nullptr;
>>> + for (auto II = BB->begin(), EI =
>>> BasicBlock::iterator(BB->getFirstNonPHI());
>>> + II != EI; II++) {
>>> + PHINode *PtrPHI = dyn_cast<PHINode>(II);
>>> + if (!PtrPHI || PtrPHI == &PN || PtrPHI->getType() !=
>>> IntToPtr->getType())
>>> + continue;
>>> + MatchingPtrPHI = PtrPHI;
>>> + for (unsigned i = 0; i != PtrPHI->getNumIncomingValues(); ++i) {
>>> + if (AvailablePtrVals[i] !=
>>> + PtrPHI->getIncomingValueForBlock(PN.getIncomingBlock(i))) {
>>> + MatchingPtrPHI = nullptr;
>>> + break;
>>> + }
>>> + }
>>> +
>>> + if (MatchingPtrPHI)
>>> + break;
>>> + }
>>> +
>>> + if (MatchingPtrPHI) {
>>> + assert(MatchingPtrPHI->getType() == IntToPtr->getType() &&
>>> + "Phi's Type does not match with IntToPtr");
>>> + // The PtrToCast + IntToPtr will be simplified later
>>> + return CastInst::CreateBitOrPointerCast(MatchingPtrPHI,
>>> +
>>> IntToPtr->getOperand(0)->getType());
>>> + }
>>> +
>>> + // If it requires a conversion for every PHI operand, do not do it.
>>> + if (std::all_of(AvailablePtrVals.begin(), AvailablePtrVals.end(),
>>> + [&](Value *V) {
>>> + return (V->getType() != IntToPtr->getType()) ||
>>> + isa<IntToPtrInst>(V);
>>> + }))
>>> + return nullptr;
>>> +
>>> + // If any of the operand that requires casting is a terminator
>>> + // instruction, do not do it.
>>> + if (std::any_of(AvailablePtrVals.begin(), AvailablePtrVals.end(),
>>> + [&](Value *V) {
>>> + return (V->getType() != IntToPtr->getType()) &&
>>> + isa<TerminatorInst>(V);
>>> + }))
>>> + return nullptr;
>>> +
>>> + PHINode *NewPtrPHI = PHINode::Create(
>>> + IntToPtr->getType(), PN.getNumIncomingValues(), PN.getName() +
>>> ".ptr");
>>> +
>>> + InsertNewInstBefore(NewPtrPHI, PN);
>>> + SmallDenseMap<Value *, Instruction *> Casts;
>>> + for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
>>> + auto *IncomingBB = PN.getIncomingBlock(i);
>>> + auto *IncomingVal = AvailablePtrVals[i];
>>> +
>>> + if (IncomingVal->getType() == IntToPtr->getType()) {
>>> + NewPtrPHI->addIncoming(IncomingVal, IncomingBB);
>>> + continue;
>>> + }
>>> +
>>> +#ifndef NDEBUG
>>> + LoadInst *LoadI = dyn_cast<LoadInst>(IncomingVal);
>>> + assert((isa<PHINode>(IncomingVal) ||
>>> + IncomingVal->getType()->isPointerTy() ||
>>> + (LoadI && LoadI->hasOneUse())) &&
>>> + "Can not replace LoadInst with multiple uses");
>>> +#endif
>>> + // Need to insert a BitCast.
>>> + // For an integer Load instruction with a single use, the load +
>>> IntToPtr
>>> + // cast will be simplified into a pointer load:
>>> + // %v = load i64, i64* %a.ip, align 8
>>> + // %v.cast = inttoptr i64 %v to float **
>>> + // ==>
>>> + // %v.ptrp = bitcast i64 * %a.ip to float **
>>> + // %v.cast = load float *, float ** %v.ptrp, align 8
>>> + Instruction *&CI = Casts[IncomingVal];
>>> + if (!CI) {
>>> + CI = CastInst::CreateBitOrPointerCast(IncomingVal,
>>> IntToPtr->getType(),
>>> + IncomingVal->getName() +
>>> ".ptr");
>>> + if (auto *IncomingI = dyn_cast<Instruction>(IncomingVal)) {
>>> + BasicBlock::iterator InsertPos(IncomingI);
>>> + InsertPos++;
>>> + if (isa<PHINode>(IncomingI))
>>> + InsertPos = IncomingI->getParent()->getFirstInsertionPt();
>>> + InsertNewInstBefore(CI, *InsertPos);
>>> + } else {
>>> + auto *InsertBB = &IncomingBB->getParent()->getEntryBlock();
>>> + InsertNewInstBefore(CI, *InsertBB->getFirstInsertionPt());
>>> + }
>>> + }
>>> + NewPtrPHI->addIncoming(CI, IncomingBB);
>>> + }
>>> +
>>> + // The PtrToCast + IntToPtr will be simplified later
>>> + return CastInst::CreateBitOrPointerCast(NewPtrPHI,
>>> +
>>> IntToPtr->getOperand(0)->getType());
>>> +}
>>> +
>>> /// If we have something like phi [add (a,b), add(a,c)] and if a/b/c and
>>> the
>>> /// adds all have a single use, turn this into a phi and a single binop.
>>> Instruction *InstCombiner::FoldPHIArgBinOpIntoPHI(PHINode &PN) {
>>> @@ -903,6 +1135,9 @@ Instruction *InstCombiner::visitPHINode(
>>> // this PHI only has a single use (a PHI), and if that PHI only has
>>> one use (a
>>> // PHI)... break the cycle.
>>> if (PN.hasOneUse()) {
>>> + if (Instruction *Result = FoldIntegerTypedPHI(PN))
>>> + return Result;
>>> +
>>> Instruction *PHIUser = cast<Instruction>(PN.user_back());
>>> if (PHINode *PU = dyn_cast<PHINode>(PHIUser)) {
>>> SmallPtrSet<PHINode*, 16> PotentiallyDeadPHIs;
>>>
>>> Added: llvm/trunk/test/Transforms/InstCombine/intptr7.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/intptr7.ll?rev=315272&view=auto
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Transforms/InstCombine/intptr7.ll (added)
>>> +++ llvm/trunk/test/Transforms/InstCombine/intptr7.ll Mon Oct 9
>>> 22:07:54 2017
>>> @@ -0,0 +1,58 @@
>>> +; RUN: opt < %s -instcombine -S | FileCheck %s
>>> +
>>> +define void @matching_phi(i64 %a, float* %b, i1 %cond) {
>>> +; CHECK-LABEL: @matching_phi
>>> +entry:
>>> + %cmp1 = icmp eq i1 %cond, 0
>>> + %add.int = add i64 %a, 1
>>> + %add = inttoptr i64 %add.int to float *
>>> +
>>> + %addb = getelementptr inbounds float, float* %b, i64 2
>>> + %addb.int = ptrtoint float* %addb to i64
>>> + br i1 %cmp1, label %A, label %B
>>> +A:
>>> + br label %C
>>> +B:
>>> + store float 1.0e+01, float* %add, align 4
>>> + br label %C
>>> +
>>> +C:
>>> + %a.addr.03 = phi float* [ %addb, %A ], [ %add, %B ]
>>> + %b.addr.02 = phi i64 [ %addb.int, %A ], [ %add.int, %B ]
>>> + %tmp = inttoptr i64 %b.addr.02 to float*
>>> +; CHECK: %a.addr.03 = phi
>>> +; CHECK-NEXT: = load
>>> + %tmp1 = load float, float* %tmp, align 4
>>> + %mul.i = fmul float %tmp1, 4.200000e+01
>>> + store float %mul.i, float* %a.addr.03, align 4
>>> + ret void
>>> +}
>>> +
>>> +define void @no_matching_phi(i64 %a, float* %b, i1 %cond) {
>>> +; CHECK-LABEL: @no_matching_phi
>>> +entry:
>>> + %cmp1 = icmp eq i1 %cond, 0
>>> + %add.int = add i64 %a, 1
>>> + %add = inttoptr i64 %add.int to float *
>>> +
>>> + %addb = getelementptr inbounds float, float* %b, i64 2
>>> + %addb.int = ptrtoint float* %addb to i64
>>> + br i1 %cmp1, label %A, label %B
>>> +A:
>>> + br label %C
>>> +B:
>>> + store float 1.0e+01, float* %add, align 4
>>> + br label %C
>>> +
>>> +C:
>>> + %a.addr.03 = phi float* [ %addb, %A ], [ %add, %B ]
>>> + %b.addr.02 = phi i64 [ %addb.int, %B ], [ %add.int, %A ]
>>> + %tmp = inttoptr i64 %b.addr.02 to float*
>>> + %tmp1 = load float, float* %tmp, align 4
>>> +; CHECK: %a.addr.03 = phi
>>> +; CHECK-NEXT: %b.addr.02.ptr = phi
>>> +; CHECK-NEXT: = load
>>> + %mul.i = fmul float %tmp1, 4.200000e+01
>>> + store float %mul.i, float* %a.addr.03, align 4
>>> + ret void
>>> +}
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>>
>>>
>>
>