[llvm] r315272 - Renable r314928

Xinliang David Li via llvm-commits llvm-commits at lists.llvm.org
Thu May 17 10:22:01 PDT 2018


I agree in principle. However there is a also advantages of keeping this in
instcombine so creating a separate pass to handle the extreme rare case
seems like an overkill to though I do object to anyone reworking the
optimization in that direction.  For the time being,  I have a patch ready
for review.

thanks,

David

On Thu, May 17, 2018 at 9:08 AM Michael Zolotukhin <mzolotukhin at apple.com>
wrote:

> A limit would definitely help here, but don’t we have a more fundamental
> problem here? I think the root of all evils is that in a local to an
> instruction routine we’re iterating over all instructions in its parent.
> Would it make more sense to convert it to a separate function pass (from an
> instruction visitor)? This way we can probably save some unnecessary
> traversals.
>
> Michael
>
> On May 16, 2018, at 10:06 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>
> Some kind of limit is probably needed. I can take a look at it if you do
> not beat me to it.
>
> David
>
>
> On Wed, May 16, 2018 at 6:57 PM Mikhail Zolotukhin <mzolotukhin at apple.com>
> wrote:
>
>> Hi David,
>>
>> We found a huge compile time regression in an internal clang build caused
>> by this change. I looked at the change and it seems that we’re scanning all
>> phi-nodes in the basic block for every phi, which makes the algorithm
>> quadratic. Of course, it’s not always the case, but as our build showed,
>> it’s quite possible. How can we address it?
>>
>> Here is a simple script to expose the issue:
>>
>>
>> Usage:
>> > for i in `seq 4000 2000 20000`; do echo $i; python genphis.py $i | time
>> opt -instcombine -o /dev/null; done
>>
>> 4000
>>         0.20 real         0.16 user         0.00 sys
>> 6000
>>         0.37 real         0.32 user         0.00 sys
>> 8000
>>         0.98 real         0.92 user         0.01 sys
>> 10000
>>         2.65 real         2.58 user         0.01 sys
>> 12000
>>         3.79 real         3.71 user         0.02 sys
>> 14000
>>         5.08 real         4.99 user         0.02 sys
>> 16000
>>         6.80 real         6.71 user         0.03 sys
>> 18000
>>         8.41 real         8.31 user         0.03 sys
>> 20000
>>        10.43 real        10.31 user         0.04 sys
>>
>>
>> Thanks,
>> Michael
>>
>> On Oct 9, 2017, at 10:07 PM, Xinliang David Li via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>> Author: davidxl
>> Date: Mon Oct  9 22:07:54 2017
>> New Revision: 315272
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=315272&view=rev
>> Log:
>> Renable r314928
>>
>>
>> Eliminate inttype phi with inttoptr/ptrtoint.
>>
>> This version fixed a bug in finding the matching
>> phi -- the order of the incoming blocks may be
>> different (triggered in self build on Windows).
>> A new test case is added.
>>
>> Added:
>>    llvm/trunk/test/Transforms/InstCombine/intptr1.ll
>>      - copied unchanged from r315107,
>> llvm/trunk/test/Transforms/InstCombine/intptr1.ll
>>    llvm/trunk/test/Transforms/InstCombine/intptr2.ll
>>      - copied unchanged from r315107,
>> llvm/trunk/test/Transforms/InstCombine/intptr2.ll
>>    llvm/trunk/test/Transforms/InstCombine/intptr3.ll
>>      - copied unchanged from r315107,
>> llvm/trunk/test/Transforms/InstCombine/intptr3.ll
>>    llvm/trunk/test/Transforms/InstCombine/intptr4.ll
>>      - copied unchanged from r315107,
>> llvm/trunk/test/Transforms/InstCombine/intptr4.ll
>>    llvm/trunk/test/Transforms/InstCombine/intptr5.ll
>>      - copied unchanged from r315107,
>> llvm/trunk/test/Transforms/InstCombine/intptr5.ll
>>    llvm/trunk/test/Transforms/InstCombine/intptr6.ll
>>      - copied unchanged from r315107,
>> llvm/trunk/test/Transforms/InstCombine/intptr6.ll
>>    llvm/trunk/test/Transforms/InstCombine/intptr7.ll
>> Modified:
>>    llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>>    llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp
>>
>> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h?rev=315272&r1=315271&r2=315272&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h (original)
>> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h Mon Oct
>>  9 22:07:54 2017
>> @@ -670,6 +670,10 @@ private:
>>   Instruction *FoldPHIArgGEPIntoPHI(PHINode &PN);
>>   Instruction *FoldPHIArgLoadIntoPHI(PHINode &PN);
>>   Instruction *FoldPHIArgZextsIntoPHI(PHINode &PN);
>> +  /// If an integer typed PHI has only one use which is an IntToPtr
>> operation,
>> +  /// replace the PHI with an existing pointer typed PHI if it exists.
>> Otherwise
>> +  /// insert a new pointer typed PHI and replace the original one.
>> +  Instruction *FoldIntegerTypedPHI(PHINode &PN);
>>
>>   /// Helper function for FoldPHIArgXIntoPHI() to set debug location for
>> the
>>   /// folded operation.
>>
>> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp?rev=315272&r1=315271&r2=315272&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp (original)
>> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombinePHI.cpp Mon Oct  9
>> 22:07:54 2017
>> @@ -40,6 +40,238 @@ void InstCombiner::PHIArgMergedDebugLoc(
>>   }
>> }
>>
>> +// Replace Integer typed PHI PN if the PHI's value is used as a pointer
>> value.
>> +// If there is an existing pointer typed PHI that produces the same
>> value as PN,
>> +// replace PN and the IntToPtr operation with it. Otherwise, synthesize
>> a new
>> +// PHI node:
>> +//
>> +// Case-1:
>> +// bb1:
>> +//     int_init = PtrToInt(ptr_init)
>> +//     br label %bb2
>> +// bb2:
>> +//    int_val = PHI([int_init, %bb1], [int_val_inc, %bb2]
>> +//    ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>> +//    ptr_val2 = IntToPtr(int_val)
>> +//    ...
>> +//    use(ptr_val2)
>> +//    ptr_val_inc = ...
>> +//    inc_val_inc = PtrToInt(ptr_val_inc)
>> +//
>> +// ==>
>> +// bb1:
>> +//     br label %bb2
>> +// bb2:
>> +//    ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>> +//    ...
>> +//    use(ptr_val)
>> +//    ptr_val_inc = ...
>> +//
>> +// Case-2:
>> +// bb1:
>> +//    int_ptr = BitCast(ptr_ptr)
>> +//    int_init = Load(int_ptr)
>> +//    br label %bb2
>> +// bb2:
>> +//    int_val = PHI([int_init, %bb1], [int_val_inc, %bb2]
>> +//    ptr_val2 = IntToPtr(int_val)
>> +//    ...
>> +//    use(ptr_val2)
>> +//    ptr_val_inc = ...
>> +//    inc_val_inc = PtrToInt(ptr_val_inc)
>> +// ==>
>> +// bb1:
>> +//    ptr_init = Load(ptr_ptr)
>> +//    br label %bb2
>> +// bb2:
>> +//    ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2]
>> +//    ...
>> +//    use(ptr_val)
>> +//    ptr_val_inc = ...
>> +//    ...
>> +//
>> +Instruction *InstCombiner::FoldIntegerTypedPHI(PHINode &PN) {
>> +  if (!PN.getType()->isIntegerTy())
>> +    return nullptr;
>> +  if (!PN.hasOneUse())
>> +    return nullptr;
>> +
>> +  auto *IntToPtr = dyn_cast<IntToPtrInst>(PN.user_back());
>> +  if (!IntToPtr)
>> +    return nullptr;
>> +
>> +  // Check if the pointer is actually used as pointer:
>> +  auto HasPointerUse = [](Instruction *IIP) {
>> +    for (User *U : IIP->users()) {
>> +      Value *Ptr = nullptr;
>> +      if (LoadInst *LoadI = dyn_cast<LoadInst>(U)) {
>> +        Ptr = LoadI->getPointerOperand();
>> +      } else if (StoreInst *SI = dyn_cast<StoreInst>(U)) {
>> +        Ptr = SI->getPointerOperand();
>> +      } else if (GetElementPtrInst *GI = dyn_cast<GetElementPtrInst>(U))
>> {
>> +        Ptr = GI->getPointerOperand();
>> +      }
>> +
>> +      if (Ptr && Ptr == IIP)
>> +        return true;
>> +    }
>> +    return false;
>> +  };
>> +
>> +  if (!HasPointerUse(IntToPtr))
>> +    return nullptr;
>> +
>> +  if (DL.getPointerSizeInBits(IntToPtr->getAddressSpace()) !=
>> +      DL.getTypeSizeInBits(IntToPtr->getOperand(0)->getType()))
>> +    return nullptr;
>> +
>> +  SmallVector<Value *, 4> AvailablePtrVals;
>> +  for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
>> +    Value *Arg = PN.getIncomingValue(i);
>> +
>> +    // First look backward:
>> +    if (auto *PI = dyn_cast<PtrToIntInst>(Arg)) {
>> +      AvailablePtrVals.emplace_back(PI->getOperand(0));
>> +      continue;
>> +    }
>> +
>> +    // Next look forward:
>> +    Value *ArgIntToPtr = nullptr;
>> +    for (User *U : Arg->users()) {
>> +      if (isa<IntToPtrInst>(U) && U->getType() == IntToPtr->getType() &&
>> +          (DT.dominates(cast<Instruction>(U), PN.getIncomingBlock(i)) ||
>> +           cast<Instruction>(U)->getParent() == PN.getIncomingBlock(i)))
>> {
>> +        ArgIntToPtr = U;
>> +        break;
>> +      }
>> +    }
>> +
>> +    if (ArgIntToPtr) {
>> +      AvailablePtrVals.emplace_back(ArgIntToPtr);
>> +      continue;
>> +    }
>> +
>> +    // If Arg is defined by a PHI, allow it. This will also create
>> +    // more opportunities iteratively.
>> +    if (isa<PHINode>(Arg)) {
>> +      AvailablePtrVals.emplace_back(Arg);
>> +      continue;
>> +    }
>> +
>> +    // For a single use integer load:
>> +    auto *LoadI = dyn_cast<LoadInst>(Arg);
>> +    if (!LoadI)
>> +      return nullptr;
>> +
>> +    if (!LoadI->hasOneUse())
>> +      return nullptr;
>> +
>> +    // Push the integer typed Load instruction into the available
>> +    // value set, and fix it up later when the pointer typed PHI
>> +    // is synthesized.
>> +    AvailablePtrVals.emplace_back(LoadI);
>> +  }
>> +
>> +  // Now search for a matching PHI
>> +  auto *BB = PN.getParent();
>> +  assert(AvailablePtrVals.size() == PN.getNumIncomingValues() &&
>> +         "Not enough available ptr typed incoming values");
>> +  PHINode *MatchingPtrPHI = nullptr;
>> +  for (auto II = BB->begin(), EI =
>> BasicBlock::iterator(BB->getFirstNonPHI());
>> +       II != EI; II++) {
>> +    PHINode *PtrPHI = dyn_cast<PHINode>(II);
>> +    if (!PtrPHI || PtrPHI == &PN || PtrPHI->getType() !=
>> IntToPtr->getType())
>> +      continue;
>> +    MatchingPtrPHI = PtrPHI;
>> +    for (unsigned i = 0; i != PtrPHI->getNumIncomingValues(); ++i) {
>> +      if (AvailablePtrVals[i] !=
>> +          PtrPHI->getIncomingValueForBlock(PN.getIncomingBlock(i))) {
>> +        MatchingPtrPHI = nullptr;
>> +        break;
>> +      }
>> +    }
>> +
>> +    if (MatchingPtrPHI)
>> +      break;
>> +  }
>> +
>> +  if (MatchingPtrPHI) {
>> +    assert(MatchingPtrPHI->getType() == IntToPtr->getType() &&
>> +           "Phi's Type does not match with IntToPtr");
>> +    // The PtrToCast + IntToPtr will be simplified later
>> +    return CastInst::CreateBitOrPointerCast(MatchingPtrPHI,
>> +
>>                                            IntToPtr->getOperand(0)->getType());
>> +  }
>> +
>> +  // If it requires a conversion for every PHI operand, do not do it.
>> +  if (std::all_of(AvailablePtrVals.begin(), AvailablePtrVals.end(),
>> +                  [&](Value *V) {
>> +                    return (V->getType() != IntToPtr->getType()) ||
>> +                           isa<IntToPtrInst>(V);
>> +                  }))
>> +    return nullptr;
>> +
>> +  // If any of the operand that requires casting is a terminator
>> +  // instruction, do not do it.
>> +  if (std::any_of(AvailablePtrVals.begin(), AvailablePtrVals.end(),
>> +                  [&](Value *V) {
>> +                    return (V->getType() != IntToPtr->getType()) &&
>> +                           isa<TerminatorInst>(V);
>> +                  }))
>> +    return nullptr;
>> +
>> +  PHINode *NewPtrPHI = PHINode::Create(
>> +      IntToPtr->getType(), PN.getNumIncomingValues(), PN.getName() +
>> ".ptr");
>> +
>> +  InsertNewInstBefore(NewPtrPHI, PN);
>> +  SmallDenseMap<Value *, Instruction *> Casts;
>> +  for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
>> +    auto *IncomingBB = PN.getIncomingBlock(i);
>> +    auto *IncomingVal = AvailablePtrVals[i];
>> +
>> +    if (IncomingVal->getType() == IntToPtr->getType()) {
>> +      NewPtrPHI->addIncoming(IncomingVal, IncomingBB);
>> +      continue;
>> +    }
>> +
>> +#ifndef NDEBUG
>> +    LoadInst *LoadI = dyn_cast<LoadInst>(IncomingVal);
>> +    assert((isa<PHINode>(IncomingVal) ||
>> +            IncomingVal->getType()->isPointerTy() ||
>> +            (LoadI && LoadI->hasOneUse())) &&
>> +           "Can not replace LoadInst with multiple uses");
>> +#endif
>> +    // Need to insert a BitCast.
>> +    // For an integer Load instruction with a single use, the load +
>> IntToPtr
>> +    // cast will be simplified into a pointer load:
>> +    // %v = load i64, i64* %a.ip, align 8
>> +    // %v.cast = inttoptr i64 %v to float **
>> +    // ==>
>> +    // %v.ptrp = bitcast i64 * %a.ip to float **
>> +    // %v.cast = load float *, float ** %v.ptrp, align 8
>> +    Instruction *&CI = Casts[IncomingVal];
>> +    if (!CI) {
>> +      CI = CastInst::CreateBitOrPointerCast(IncomingVal,
>> IntToPtr->getType(),
>> +                                            IncomingVal->getName() +
>> ".ptr");
>> +      if (auto *IncomingI = dyn_cast<Instruction>(IncomingVal)) {
>> +        BasicBlock::iterator InsertPos(IncomingI);
>> +        InsertPos++;
>> +        if (isa<PHINode>(IncomingI))
>> +          InsertPos = IncomingI->getParent()->getFirstInsertionPt();
>> +        InsertNewInstBefore(CI, *InsertPos);
>> +      } else {
>> +        auto *InsertBB = &IncomingBB->getParent()->getEntryBlock();
>> +        InsertNewInstBefore(CI, *InsertBB->getFirstInsertionPt());
>> +      }
>> +    }
>> +    NewPtrPHI->addIncoming(CI, IncomingBB);
>> +  }
>> +
>> +  // The PtrToCast + IntToPtr will be simplified later
>> +  return CastInst::CreateBitOrPointerCast(NewPtrPHI,
>> +
>>                                          IntToPtr->getOperand(0)->getType());
>> +}
>> +
>> /// If we have something like phi [add (a,b), add(a,c)] and if a/b/c and
>> the
>> /// adds all have a single use, turn this into a phi and a single binop.
>> Instruction *InstCombiner::FoldPHIArgBinOpIntoPHI(PHINode &PN) {
>> @@ -903,6 +1135,9 @@ Instruction *InstCombiner::visitPHINode(
>>   // this PHI only has a single use (a PHI), and if that PHI only has one
>> use (a
>>   // PHI)... break the cycle.
>>   if (PN.hasOneUse()) {
>> +    if (Instruction *Result = FoldIntegerTypedPHI(PN))
>> +      return Result;
>> +
>>     Instruction *PHIUser = cast<Instruction>(PN.user_back());
>>     if (PHINode *PU = dyn_cast<PHINode>(PHIUser)) {
>>       SmallPtrSet<PHINode*, 16> PotentiallyDeadPHIs;
>>
>> Added: llvm/trunk/test/Transforms/InstCombine/intptr7.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/intptr7.ll?rev=315272&view=auto
>>
>> ==============================================================================
>> --- llvm/trunk/test/Transforms/InstCombine/intptr7.ll (added)
>> +++ llvm/trunk/test/Transforms/InstCombine/intptr7.ll Mon Oct  9 22:07:54
>> 2017
>> @@ -0,0 +1,58 @@
>> +; RUN: opt < %s  -instcombine -S | FileCheck %s
>> +
>> +define void @matching_phi(i64 %a, float* %b, i1 %cond) {
>> +; CHECK-LABEL: @matching_phi
>> +entry:
>> +  %cmp1 = icmp  eq i1 %cond, 0
>> +  %add.int = add i64 %a, 1
>> +  %add = inttoptr i64 %add.int to float *
>> +
>> +  %addb = getelementptr inbounds float, float* %b, i64 2
>> +  %addb.int = ptrtoint float* %addb to i64
>> +  br i1 %cmp1, label %A, label %B
>> +A:
>> +  br label %C
>> +B:
>> +  store float 1.0e+01, float* %add, align 4
>> +  br label %C
>> +
>> +C:
>> +  %a.addr.03 = phi float* [ %addb, %A ], [ %add, %B ]
>> +  %b.addr.02 = phi i64 [ %addb.int, %A ], [ %add.int, %B ]
>> +  %tmp = inttoptr i64 %b.addr.02 to float*
>> +; CHECK: %a.addr.03 = phi
>> +; CHECK-NEXT: = load
>> +  %tmp1 = load float, float* %tmp, align 4
>> +  %mul.i = fmul float %tmp1, 4.200000e+01
>> +  store float %mul.i, float* %a.addr.03, align 4
>> +  ret void
>> +}
>> +
>> +define void @no_matching_phi(i64 %a, float* %b, i1 %cond) {
>> +; CHECK-LABEL: @no_matching_phi
>> +entry:
>> +  %cmp1 = icmp  eq i1 %cond, 0
>> +  %add.int = add i64 %a, 1
>> +  %add = inttoptr i64 %add.int to float *
>> +
>> +  %addb = getelementptr inbounds float, float* %b, i64 2
>> +  %addb.int = ptrtoint float* %addb to i64
>> +  br i1 %cmp1, label %A, label %B
>> +A:
>> +  br label %C
>> +B:
>> +  store float 1.0e+01, float* %add, align 4
>> +  br label %C
>> +
>> +C:
>> +  %a.addr.03 = phi float* [ %addb, %A ], [ %add, %B ]
>> +  %b.addr.02 = phi i64 [ %addb.int, %B ], [ %add.int, %A ]
>> +  %tmp = inttoptr i64 %b.addr.02 to float*
>> +  %tmp1 = load float, float* %tmp, align 4
>> +; CHECK: %a.addr.03 = phi
>> +; CHECK-NEXT: %b.addr.02.ptr = phi
>> +; CHECK-NEXT: = load
>> +  %mul.i = fmul float %tmp1, 4.200000e+01
>> +  store float %mul.i, float* %a.addr.03, align 4
>> +  ret void
>> +}
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180517/2ee2e320/attachment.html>


More information about the llvm-commits mailing list