[llvm] r315272 - Renable r314928

Xinliang David Li via llvm-commits llvm-commits at lists.llvm.org
Thu May 17 10:46:46 PDT 2018


Sounds good to me.  I will drop my two line patch then.

On Thu, May 17, 2018 at 10:33 AM Michael Zolotukhin <mzolotukhin at apple.com>
wrote:

>
>
> On May 17, 2018, at 10:22 AM, Xinliang David Li <davidxl at google.com>
> wrote:
>
> I agree in principle. However there is a also advantages of keeping this
> in instcombine so creating a separate pass to handle the extreme rare case
> seems like an overkill to though I do object to anyone reworking the
> optimization in that direction.
>
> I agree a separate pass would be an overkill, but I was thinking more in
> line of AggressiveInstCombine, which seems to fit perfectly for this case.
> WDYT?
>
>   For the time being,  I have a patch ready for review.
>
> Thanks!
>
> Michael
>
>
> thanks,
>
> David
>
> On Thu, May 17, 2018 at 9:08 AM Michael Zolotukhin <mzolotukhin at apple.com>
> wrote:
>
>> A limit would definitely help here, but don’t we have a more fundamental
>> problem here? I think the root of all evils is that in a local to an
>> instruction routine we’re iterating over all instructions in its parent.
>> Would it make more sense to convert it to a separate function pass (from an
>> instruction visitor)? This way we can probably save some unnecessary
>> traversals.
>>
>> Michael
>>
>> On May 16, 2018, at 10:06 PM, Xinliang David Li <davidxl at google.com>
>> wrote:
>>
>> Some kind of limit is probably needed. I can take a look at it if you do
>> not beat me to it.
>>
>> David
>>
>>
>> On Wed, May 16, 2018 at 6:57 PM Mikhail Zolotukhin <mzolotukhin at apple.com>
>> wrote:
>>
>>> Hi David,
>>>
>>> We found a huge compile time regression in an internal clang build
>>> caused by this change. I looked at the change and it seems that we’re
>>> scanning all phi-nodes in the basic block for every phi, which makes the
>>> algorithm quadratic. Of course, it’s not always the case, but as our build
>>> showed, it’s quite possible. How can we address it?
>>>
>>> Here is a simple script to expose the issue:
>>>
>>>
>>> Usage:
>>> > for i in `seq 4000 2000 20000`; do echo $i; python genphis.py $i |
>>> time opt -instcombine -o /dev/null; done
>>>
>>> 4000
>>>         0.20 real         0.16 user         0.00 sys
>>> 6000
>>>         0.37 real         0.32 user         0.00 sys
>>> 8000
>>>         0.98 real         0.92 user         0.01 sys
>>> 10000
>>>         2.65 real         2.58 user         0.01 sys
>>> 12000
>>>         3.79 real         3.71 user         0.02 sys
>>> 14000
>>>         5.08 real         4.99 user         0.02 sys
>>> 16000
>>>         6.80 real         6.71 user         0.03 sys
>>> 18000
>>>         8.41 real         8.31 user         0.03 sys
>>> 20000
>>>        10.43 real        10.31 user         0.04 sys
>>>
>>>
>>> Thanks,
>>> Michael
>>>
>>> On Oct 9, 2017, at 10:07 PM, Xinliang David Li via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>>
>>> Author: davidxl
>>> Date: Mon Oct  9 22:07:54 2017
>>> New Revision: 315272
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=315272&view=rev
>>> Log:
>>> Renable r314928
>>>
>>>
>>> Eliminate inttype phi with inttoptr/ptrtoint.
>>>
>>> This version fixed a bug in finding the matching
>>> phi -- the order of the incoming blocks may be
>>> different (triggered in self build on Windows).
>>> A new test case is added.
>>>
>>> Added:
>>>    llvm/trunk/test/Transforms/InstCombine/intptr1.ll
>>>      - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr1.ll
>>>    llvm/trunk/test/Transforms/InstCombine/intptr2.ll
>>>      - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr2.ll
>>>    llvm/trunk/test/Transforms/InstCombine/intptr3.ll
>>>      - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr3.ll
>>>    llvm/trunk/test/Transforms/InstCombine/intptr4.ll
>>>      - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr4.ll
>>>    llvm/trunk/test/Transforms/InstCombine/intptr5.ll
>>>      - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr5.ll
>>>    llvm/trunk/test/Transforms/InstCombine/intptr6.ll
>>>      - copied unchanged from r315107,
>>> llvm/trunk/test/Transforms/InstCombine/intptr6.ll
>>>    llvm/trunk/test/Transforms/InstCombine/intptr7.ll
>>> Modified:
>>>    llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>>>    llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp
>>>
>>> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h?rev=315272&r1=315271&r2=315272&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>>> (original)
>>> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h Mon Oct
>>>  9 22:07:54 2017
>>> @@ -670,6 +670,10 @@ private:
>>>   Instruction *FoldPHIArgGEPIntoPHI(PHINode &PN);
>>>   Instruction *FoldPHIArgLoadIntoPHI(PHINode &PN);
>>>   Instruction *FoldPHIArgZextsIntoPHI(PHINode &PN);
>>> +  /// If an integer typed PHI has only one use which is an IntToPtr
>>> operation,
>>> +  /// replace the PHI with an existing pointer typed PHI if it exists.
>>> Otherwise
>>> +  /// insert a new pointer typed PHI and replace the original one.
>>> +  Instruction *FoldIntegerTypedPHI(PHINode &PN);
>>>
>>>   /// Helper function for FoldPHIArgXIntoPHI() to set debug location for
>>> the
>>>   /// folded operation.
>>>
>>> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp?rev=315272&r1=315271&r2=315272&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp (original)
>>> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp Mon Oct  9
>>> 22:07:54 2017
>>> @@ -40,6 +40,238 @@ void InstCombiner::PHIArgMergedDebugLoc(
>>>   }
>>> }
>>>
>>> +// Replace Integer typed PHI PN if the PHI's value is used as a pointer
>>> value.
>>> +// If there is an existing pointer typed PHI that produces the same
>>> value as PN,
>>> +// replace PN and the IntToPtr operation with it. Otherwise, synthesize
>>> a new
>>> +// PHI node:
>>> +//
>>> +// Case-1:
>>> +// bb1:
>>> +//     int_init = PtrToInt(ptr_init)
>>> +//     br label %bb2
>>> +// bb2:
>>> +//    int_val = PHI([int_init, %bb1], [int_val_inc, %bb2]
>>> +//    ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>>> +//    ptr_val2 = IntToPtr(int_val)
>>> +//    ...
>>> +//    use(ptr_val2)
>>> +//    ptr_val_inc = ...
>>> +//    inc_val_inc = PtrToInt(ptr_val_inc)
>>> +//
>>> +// ==>
>>> +// bb1:
>>> +//     br label %bb2
>>> +// bb2:
>>> +//    ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>>> +//    ...
>>> +//    use(ptr_val)
>>> +//    ptr_val_inc = ...
>>> +//
>>> +// Case-2:
>>> +// bb1:
>>> +//    int_ptr = BitCast(ptr_ptr)
>>> +//    int_init = Load(int_ptr)
>>> +//    br label %bb2
>>> +// bb2:
>>> +//    int_val = PHI([int_init, %bb1], [int_val_inc, %bb2]
>>> +//    ptr_val2 = IntToPtr(int_val)
>>> +//    ...
>>> +//    use(ptr_val2)
>>> +//    ptr_val_inc = ...
>>> +//    inc_val_inc = PtrToInt(ptr_val_inc)
>>> +// ==>
>>> +// bb1:
>>> +//    ptr_init = Load(ptr_ptr)
>>> +//    br label %bb2
>>> +// bb2:
>>> +//    ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>>> +//    ...
>>> +//    use(ptr_val)
>>> +//    ptr_val_inc = ...
>>> +//    ...
>>> +//
>>> +Instruction *InstCombiner::FoldIntegerTypedPHI(PHINode &PN) {
>>> +  if (!PN.getType()->isIntegerTy())
>>> +    return nullptr;
>>> +  if (!PN.hasOneUse())
>>> +    return nullptr;
>>> +
>>> +  auto *IntToPtr = dyn_cast<IntToPtrInst>(PN.user_back());
>>> +  if (!IntToPtr)
>>> +    return nullptr;
>>> +
>>> +  // Check if the pointer is actually used as pointer:
>>> +  auto HasPointerUse = [](Instruction *IIP) {
>>> +    for (User *U : IIP->users()) {
>>> +      Value *Ptr = nullptr;
>>> +      if (LoadInst *LoadI = dyn_cast<LoadInst>(U)) {
>>> +        Ptr = LoadI->getPointerOperand();
>>> +      } else if (StoreInst *SI = dyn_cast<StoreInst>(U)) {
>>> +        Ptr = SI->getPointerOperand();
>>> +      } else if (GetElementPtrInst *GI =
>>> dyn_cast<GetElementPtrInst>(U)) {
>>> +        Ptr = GI->getPointerOperand();
>>> +      }
>>> +
>>> +      if (Ptr && Ptr == IIP)
>>> +        return true;
>>> +    }
>>> +    return false;
>>> +  };
>>> +
>>> +  if (!HasPointerUse(IntToPtr))
>>> +    return nullptr;
>>> +
>>> +  if (DL.getPointerSizeInBits(IntToPtr->getAddressSpace()) !=
>>> +      DL.getTypeSizeInBits(IntToPtr->getOperand(0)->getType()))
>>> +    return nullptr;
>>> +
>>> +  SmallVector<Value *, 4> AvailablePtrVals;
>>> +  for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
>>> +    Value *Arg = PN.getIncomingValue(i);
>>> +
>>> +    // First look backward:
>>> +    if (auto *PI = dyn_cast<PtrToIntInst>(Arg)) {
>>> +      AvailablePtrVals.emplace_back(PI->getOperand(0));
>>> +      continue;
>>> +    }
>>> +
>>> +    // Next look forward:
>>> +    Value *ArgIntToPtr = nullptr;
>>> +    for (User *U : Arg->users()) {
>>> +      if (isa<IntToPtrInst>(U) && U->getType() == IntToPtr->getType() &&
>>> +          (DT.dominates(cast<Instruction>(U), PN.getIncomingBlock(i)) ||
>>> +           cast<Instruction>(U)->getParent() ==
>>> PN.getIncomingBlock(i))) {
>>> +        ArgIntToPtr = U;
>>> +        break;
>>> +      }
>>> +    }
>>> +
>>> +    if (ArgIntToPtr) {
>>> +      AvailablePtrVals.emplace_back(ArgIntToPtr);
>>> +      continue;
>>> +    }
>>> +
>>> +    // If Arg is defined by a PHI, allow it. This will also create
>>> +    // more opportunities iteratively.
>>> +    if (isa<PHINode>(Arg)) {
>>> +      AvailablePtrVals.emplace_back(Arg);
>>> +      continue;
>>> +    }
>>> +
>>> +    // For a single use integer load:
>>> +    auto *LoadI = dyn_cast<LoadInst>(Arg);
>>> +    if (!LoadI)
>>> +      return nullptr;
>>> +
>>> +    if (!LoadI->hasOneUse())
>>> +      return nullptr;
>>> +
>>> +    // Push the integer typed Load instruction into the available
>>> +    // value set, and fix it up later when the pointer typed PHI
>>> +    // is synthesized.
>>> +    AvailablePtrVals.emplace_back(LoadI);
>>> +  }
>>> +
>>> +  // Now search for a matching PHI
>>> +  auto *BB = PN.getParent();
>>> +  assert(AvailablePtrVals.size() == PN.getNumIncomingValues() &&
>>> +         "Not enough available ptr typed incoming values");
>>> +  PHINode *MatchingPtrPHI = nullptr;
>>> +  for (auto II = BB->begin(), EI =
>>> BasicBlock::iterator(BB->getFirstNonPHI());
>>> +       II != EI; II++) {
>>> +    PHINode *PtrPHI = dyn_cast<PHINode>(II);
>>> +    if (!PtrPHI || PtrPHI == &PN || PtrPHI->getType() !=
>>> IntToPtr->getType())
>>> +      continue;
>>> +    MatchingPtrPHI = PtrPHI;
>>> +    for (unsigned i = 0; i != PtrPHI->getNumIncomingValues(); ++i) {
>>> +      if (AvailablePtrVals[i] !=
>>> +          PtrPHI->getIncomingValueForBlock(PN.getIncomingBlock(i))) {
>>> +        MatchingPtrPHI = nullptr;
>>> +        break;
>>> +      }
>>> +    }
>>> +
>>> +    if (MatchingPtrPHI)
>>> +      break;
>>> +  }
>>> +
>>> +  if (MatchingPtrPHI) {
>>> +    assert(MatchingPtrPHI->getType() == IntToPtr->getType() &&
>>> +           "Phi's Type does not match with IntToPtr");
>>> +    // The PtrToCast + IntToPtr will be simplified later
>>> +    return CastInst::CreateBitOrPointerCast(MatchingPtrPHI,
>>> +
>>>                                            IntToPtr->getOperand(0)->getType());
>>> +  }
>>> +
>>> +  // If it requires a conversion for every PHI operand, do not do it.
>>> +  if (std::all_of(AvailablePtrVals.begin(), AvailablePtrVals.end(),
>>> +                  [&](Value *V) {
>>> +                    return (V->getType() != IntToPtr->getType()) ||
>>> +                           isa<IntToPtrInst>(V);
>>> +                  }))
>>> +    return nullptr;
>>> +
>>> +  // If any of the operand that requires casting is a terminator
>>> +  // instruction, do not do it.
>>> +  if (std::any_of(AvailablePtrVals.begin(), AvailablePtrVals.end(),
>>> +                  [&](Value *V) {
>>> +                    return (V->getType() != IntToPtr->getType()) &&
>>> +                           isa<TerminatorInst>(V);
>>> +                  }))
>>> +    return nullptr;
>>> +
>>> +  PHINode *NewPtrPHI = PHINode::Create(
>>> +      IntToPtr->getType(), PN.getNumIncomingValues(), PN.getName() +
>>> ".ptr");
>>> +
>>> +  InsertNewInstBefore(NewPtrPHI, PN);
>>> +  SmallDenseMap<Value *, Instruction *> Casts;
>>> +  for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
>>> +    auto *IncomingBB = PN.getIncomingBlock(i);
>>> +    auto *IncomingVal = AvailablePtrVals[i];
>>> +
>>> +    if (IncomingVal->getType() == IntToPtr->getType()) {
>>> +      NewPtrPHI->addIncoming(IncomingVal, IncomingBB);
>>> +      continue;
>>> +    }
>>> +
>>> +#ifndef NDEBUG
>>> +    LoadInst *LoadI = dyn_cast<LoadInst>(IncomingVal);
>>> +    assert((isa<PHINode>(IncomingVal) ||
>>> +            IncomingVal->getType()->isPointerTy() ||
>>> +            (LoadI && LoadI->hasOneUse())) &&
>>> +           "Can not replace LoadInst with multiple uses");
>>> +#endif
>>> +    // Need to insert a BitCast.
>>> +    // For an integer Load instruction with a single use, the load +
>>> IntToPtr
>>> +    // cast will be simplified into a pointer load:
>>> +    // %v = load i64, i64* %a.ip, align 8
>>> +    // %v.cast = inttoptr i64 %v to float **
>>> +    // ==>
>>> +    // %v.ptrp = bitcast i64 * %a.ip to float **
>>> +    // %v.cast = load float *, float ** %v.ptrp, align 8
>>> +    Instruction *&CI = Casts[IncomingVal];
>>> +    if (!CI) {
>>> +      CI = CastInst::CreateBitOrPointerCast(IncomingVal,
>>> IntToPtr->getType(),
>>> +                                            IncomingVal->getName() +
>>> ".ptr");
>>> +      if (auto *IncomingI = dyn_cast<Instruction>(IncomingVal)) {
>>> +        BasicBlock::iterator InsertPos(IncomingI);
>>> +        InsertPos++;
>>> +        if (isa<PHINode>(IncomingI))
>>> +          InsertPos = IncomingI->getParent()->getFirstInsertionPt();
>>> +        InsertNewInstBefore(CI, *InsertPos);
>>> +      } else {
>>> +        auto *InsertBB = &IncomingBB->getParent()->getEntryBlock();
>>> +        InsertNewInstBefore(CI, *InsertBB->getFirstInsertionPt());
>>> +      }
>>> +    }
>>> +    NewPtrPHI->addIncoming(CI, IncomingBB);
>>> +  }
>>> +
>>> +  // The PtrToCast + IntToPtr will be simplified later
>>> +  return CastInst::CreateBitOrPointerCast(NewPtrPHI,
>>> +
>>>                                          IntToPtr->getOperand(0)->getType());
>>> +}
>>> +
>>> /// If we have something like phi [add (a,b), add(a,c)] and if a/b/c and
>>> the
>>> /// adds all have a single use, turn this into a phi and a single binop.
>>> Instruction *InstCombiner::FoldPHIArgBinOpIntoPHI(PHINode &PN) {
>>> @@ -903,6 +1135,9 @@ Instruction *InstCombiner::visitPHINode(
>>>   // this PHI only has a single use (a PHI), and if that PHI only has
>>> one use (a
>>>   // PHI)... break the cycle.
>>>   if (PN.hasOneUse()) {
>>> +    if (Instruction *Result = FoldIntegerTypedPHI(PN))
>>> +      return Result;
>>> +
>>>     Instruction *PHIUser = cast<Instruction>(PN.user_back());
>>>     if (PHINode *PU = dyn_cast<PHINode>(PHIUser)) {
>>>       SmallPtrSet<PHINode*, 16> PotentiallyDeadPHIs;
>>>
>>> Added: llvm/trunk/test/Transforms/InstCombine/intptr7.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/intptr7.ll?rev=315272&view=auto
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/Transforms/InstCombine/intptr7.ll (added)
>>> +++ llvm/trunk/test/Transforms/InstCombine/intptr7.ll Mon Oct  9
>>> 22:07:54 2017
>>> @@ -0,0 +1,58 @@
>>> +; RUN: opt < %s  -instcombine -S | FileCheck %s
>>> +
>>> +define void @matching_phi(i64 %a, float* %b, i1 %cond) {
>>> +; CHECK-LABEL: @matching_phi
>>> +entry:
>>> +  %cmp1 = icmp  eq i1 %cond, 0
>>> +  %add.int = add i64 %a, 1
>>> +  %add = inttoptr i64 %add.int to float *
>>> +
>>> +  %addb = getelementptr inbounds float, float* %b, i64 2
>>> +  %addb.int = ptrtoint float* %addb to i64
>>> +  br i1 %cmp1, label %A, label %B
>>> +A:
>>> +  br label %C
>>> +B:
>>> +  store float 1.0e+01, float* %add, align 4
>>> +  br label %C
>>> +
>>> +C:
>>> +  %a.addr.03 = phi float* [ %addb, %A ], [ %add, %B ]
>>> +  %b.addr.02 = phi i64 [ %addb.int, %A ], [ %add.int, %B ]
>>> +  %tmp = inttoptr i64 %b.addr.02 to float*
>>> +; CHECK: %a.addr.03 = phi
>>> +; CHECK-NEXT: = load
>>> +  %tmp1 = load float, float* %tmp, align 4
>>> +  %mul.i = fmul float %tmp1, 4.200000e+01
>>> +  store float %mul.i, float* %a.addr.03, align 4
>>> +  ret void
>>> +}
>>> +
>>> +define void @no_matching_phi(i64 %a, float* %b, i1 %cond) {
>>> +; CHECK-LABEL: @no_matching_phi
>>> +entry:
>>> +  %cmp1 = icmp  eq i1 %cond, 0
>>> +  %add.int = add i64 %a, 1
>>> +  %add = inttoptr i64 %add.int to float *
>>> +
>>> +  %addb = getelementptr inbounds float, float* %b, i64 2
>>> +  %addb.int = ptrtoint float* %addb to i64
>>> +  br i1 %cmp1, label %A, label %B
>>> +A:
>>> +  br label %C
>>> +B:
>>> +  store float 1.0e+01, float* %add, align 4
>>> +  br label %C
>>> +
>>> +C:
>>> +  %a.addr.03 = phi float* [ %addb, %A ], [ %add, %B ]
>>> +  %b.addr.02 = phi i64 [ %addb.int, %B ], [ %add.int, %A ]
>>> +  %tmp = inttoptr i64 %b.addr.02 to float*
>>> +  %tmp1 = load float, float* %tmp, align 4
>>> +; CHECK: %a.addr.03 = phi
>>> +; CHECK-NEXT: %b.addr.02.ptr = phi
>>> +; CHECK-NEXT: = load
>>> +  %mul.i = fmul float %tmp1, 4.200000e+01
>>> +  store float %mul.i, float* %a.addr.03, align 4
>>> +  ret void
>>> +}
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180517/55ba2566/attachment-0001.html>


More information about the llvm-commits mailing list