[llvm] r346397 - Return "[IndVars] Smart hard uses detection"

Maxim Kazantsev via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 15 01:45:45 PST 2018


Thanks! I'll try to find some time for it within next few days. It may be a symptom of a general improvement we can make, because old "better" behavior was just accidental. :)

-- Max

-----Original Message-----
From: Mikael Holmén <mikael.holmen at ericsson.com> 
Sent: Thursday, November 15, 2018 4:32 PM
To: Maxim Kazantsev <max.kazantsev at azul.com>
Cc: llvm-commits at lists.llvm.org
Subject: Re: [llvm] r346397 - Return "[IndVars] Smart hard uses detection"



On 11/15/18 10:19 AM, Maxim Kazantsev wrote:
> Hi Mikael,
> 
> That's a really weird case, in fact we have a constant in the end but for some reason it was expanded as add of two constants... I wonder why SCEV was unable to understand that it was a constant. It looks more like a SCEV bug than a bug in this particular algorithm. If this worries you, feel free to revert the patch, otherwise please file a bug and I'll take a look into it when I have some time.
> 

Ok, I wrote
  https://bugs.llvm.org/show_bug.cgi?id=39673
about it.

I suppose something is getting confused since one part of the "constant" 
comes from the first loop and one part from the second.

I noticed this when debugging another problem and saw something that previously was reduced to a constant and now suddenly wasn't anymore, so I haven't seen any benchmarks or real customer code that suffers from it (yet).

Regards,
Mikael

> Thanks,
> Max
> 
> -----Original Message-----
> From: Mikael Holmén <mikael.holmen at ericsson.com>
> Sent: Thursday, November 15, 2018 4:14 PM
> To: Maxim Kazantsev <max.kazantsev at azul.com>
> Cc: llvm-commits at lists.llvm.org
> Subject: Re: [llvm] r346397 - Return "[IndVars] Smart hard uses detection"
> 
> Hi again Max,
> 
> I found another case that got a bit worse even with the improved patch that handles constants better.
> 
> If I run
> 
>    opt -S -o - foo2.ll -indvars
> 
> with the patch I get
> 
> define i16 @test10() {
> entry:
>     br label %loop1
> 
> loop1:                                            ; preds = %loop1, %entry
>     %l1 = phi i16 [ 0, %entry ], [ %l1.add, %loop1 ]
>     %l1.add = add nuw nsw i16 %l1, 1
>     %cmp1 = icmp ult i16 %l1.add, 2
>     br i1 %cmp1, label %loop1, label %loop2.preheader
> 
> loop2.preheader:                                  ; preds = %loop1
>     br label %loop2
> 
> loop2:                                            ; preds = %loop2,
> %loop2.preheader
>     %k2 = phi i16 [ %k2.add, %loop2 ], [ 182, %loop2.preheader ]
>     %l2 = phi i16 [ %l2.add, %loop2 ], [ 0, %loop2.preheader ]
>     %l2.add = add nuw nsw i16 %l2, 1
>     tail call void @foo(i16 %k2)
>     %k2.add = add nuw nsw i16 %k2, 1
>     %cmp2 = icmp ult i16 %l2.add, 2
>     br i1 %cmp2, label %loop2, label %loop2.end
> 
> loop2.end:                                        ; preds = %loop2
>     %k2.add.lcssa = phi i16 [ %k2.add, %loop2 ]
>     ret i16 %k2.add.lcssa
> }
> 
> and without it:
> 
> define i16 @test10() {
> entry:
>     br label %loop1
> 
> loop1:                                            ; preds = %loop1, %entry
>     %l1 = phi i16 [ 0, %entry ], [ %l1.add, %loop1 ]
>     %l1.add = add nuw nsw i16 %l1, 1
>     %cmp1 = icmp ult i16 %l1.add, 2
>     br i1 %cmp1, label %loop1, label %loop2.preheader
> 
> loop2.preheader:                                  ; preds = %loop1
>     br label %loop2
> 
> loop2:                                            ; preds = %loop2,
> %loop2.preheader
>     %k2 = phi i16 [ %k2.add, %loop2 ], [ 182, %loop2.preheader ]
>     %l2 = phi i16 [ %l2.add, %loop2 ], [ 0, %loop2.preheader ]
>     %l2.add = add nuw nsw i16 %l2, 1
>     tail call void @foo(i16 %k2)
>     %k2.add = add nuw nsw i16 %k2, 1
>     %cmp2 = icmp ult i16 %l2.add, 2
>     br i1 %cmp2, label %loop2, label %loop2.end
> 
> loop2.end:                                        ; preds = %loop2
>     %0 = add i16 182, 2
>     ret i16 %0
> }
> 
> Note the difference in how the return value is calculated.
> 
> I suppose that indvars doesn't consider this to be the constant case that should always be allowed, but I think it's quite unfortunate if this simplification isn't done.
> 
> Regards,
> Mikael
> 
> On 11/8/18 12:54 PM, Max Kazantsev via llvm-commits wrote:
>> Author: mkazantsev
>> Date: Thu Nov  8 03:54:35 2018
>> New Revision: 346397
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=346397&view=rev
>> Log:
>> Return "[IndVars] Smart hard uses detection"
>>
>> The patch has been reverted because it ended up prohibiting 
>> propagation of a constant to exit value. For such values, we should 
>> skip all checks related to hard uses because propagating a constant is always profitable.
>>
>> Differential Revision: https://reviews.llvm.org/D53691
>>
>> Modified:
>>       llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp
>>       llvm/trunk/test/Analysis/ScalarEvolution/pr28705.ll
>>       llvm/trunk/test/Transforms/IndVarSimplify/dont-recompute.ll
>>       llvm/trunk/test/Transforms/IndVarSimplify/lrev-existing-umin.ll
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/
>> I ndVarSimplify.cpp?rev=346397&r1=346396&r2=346397&view=diff
>> =====================================================================
>> =
>> ========
>> --- llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp Thu Nov  8
>> +++ 03:54:35 2018
>> @@ -145,6 +145,7 @@ class IndVarSimplify {
>>      bool canLoopBeDeleted(Loop *L, SmallVector<RewritePhi, 8> &RewritePhiSet);
>>      bool rewriteLoopExitValues(Loop *L, SCEVExpander &Rewriter);
>>      bool rewriteFirstIterationLoopExitValues(Loop *L);
>> +  bool hasHardUserWithinLoop(const Loop *L, const Instruction *I) 
>> + const;
>>    
>>      bool linearFunctionTestReplace(Loop *L, const SCEV *BackedgeTakenCount,
>>                                     PHINode *IndVar, SCEVExpander 
>> &Rewriter); @@ -524,6 +525,29 @@ struct RewritePhi {
>>    // As a side effect, reduces the amount of IV processing within the loop.
>>    
>> //===----------------------------------------------------------------
>> -
>> -----===//
>>    
>> +bool IndVarSimplify::hasHardUserWithinLoop(const Loop *L, const 
>> +Instruction *I) const {
>> +  SmallPtrSet<const Instruction *, 8> Visited;
>> +  SmallVector<const Instruction *, 8> WorkList;
>> +  Visited.insert(I);
>> +  WorkList.push_back(I);
>> +  while (!WorkList.empty()) {
>> +    const Instruction *Curr = WorkList.pop_back_val();
>> +    // This use is outside the loop, nothing to do.
>> +    if (!L->contains(Curr))
>> +      continue;
>> +    // Do we assume it is a "hard" use which will not be eliminated easily?
>> +    if (Curr->mayHaveSideEffects())
>> +      return true;
>> +    // Otherwise, add all its users to worklist.
>> +    for (auto U : Curr->users()) {
>> +      auto *UI = cast<Instruction>(U);
>> +      if (Visited.insert(UI).second)
>> +        WorkList.push_back(UI);
>> +    }
>> +  }
>> +  return false;
>> +}
>> +
>>    /// Check to see if this loop has a computable loop-invariant execution count.
>>    /// If so, this means that we can compute the final value of any expressions
>>    /// that are recurrent in the loop, and substitute the exit values 
>> from the loop @@ -598,19 +622,8 @@ bool IndVarSimplify::rewriteLoopExitValu
>>            // Computing the value outside of the loop brings no benefit if it is
>>            // definitely used inside the loop in a way which can not be optimized
>>            // away.
>> -        if (ExitValue->getSCEVType()>=scMulExpr) {
>> -          bool HasHardInternalUses = false;
>> -          for (auto *IB : Inst->users()) {
>> -            Instruction *UseInstr = cast<Instruction>(IB);
>> -            unsigned Opc = UseInstr->getOpcode();
>> -            if (L->contains(UseInstr) && Opc == Instruction::Call) {
>> -              HasHardInternalUses = true;
>> -              break;
>> -            }
>> -          }
>> -          if (HasHardInternalUses)
>> -            continue;
>> -        }
>> +        if (!isa<SCEVConstant>(ExitValue) && hasHardUserWithinLoop(L, Inst))
>> +          continue;
>>    
>>            bool HighCost = Rewriter.isHighCostExpansion(ExitValue, L, Inst);
>>            Value *ExitVal = Rewriter.expandCodeFor(ExitValue,
>> PN->getType(), Inst);
>>
>> Modified: llvm/trunk/test/Analysis/ScalarEvolution/pr28705.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/ScalarEv
>> o lution/pr28705.ll?rev=346397&r1=346396&r2=346397&view=diff
>> =====================================================================
>> =
>> ========
>> --- llvm/trunk/test/Analysis/ScalarEvolution/pr28705.ll (original)
>> +++ llvm/trunk/test/Analysis/ScalarEvolution/pr28705.ll Thu Nov  8
>> +++ 03:54:35 2018
>> @@ -1,11 +1,11 @@
>>    ; PR28705
>>    ; RUN: opt < %s -indvars -S | FileCheck %s
>>    
>> -; Check IndVarSimplify replaces the exitval use of the induction var "%inc.i.i"
>> -; with "%.sroa.speculated + 1".
>> +; Check IndVarSimplify doesn't replace external use of the induction 
>> +var ; "%inc.i.i" with "%.sroa.speculated + 1" because it is not profitable.
>>    ;
>>    ; CHECK-LABEL: @foo(
>> -; CHECK: %[[EXIT:.+]] = sub i32 %.sroa.speculated, -1
>> +; CHECK: %[[EXIT:.+]] = phi i32 [ %inc.i.i, %for.body650 ]
>>    ; CHECK: %DB.sroa.9.0.lcssa = phi i32 [ 1, %entry ], [ %[[EXIT]], %loopexit ]
>>    ;
>>    define void @foo(i32 %sub.ptr.div.i, i8* %ref.i1174) 
>> local_unnamed_addr {
>>
>> Modified: llvm/trunk/test/Transforms/IndVarSimplify/dont-recompute.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/IndVar
>> S implify/dont-recompute.ll?rev=346397&r1=346396&r2=346397&view=diff
>> =====================================================================
>> =
>> ========
>> --- llvm/trunk/test/Transforms/IndVarSimplify/dont-recompute.ll
>> (original)
>> +++ llvm/trunk/test/Transforms/IndVarSimplify/dont-recompute.ll Thu 
>> +++ Nov  8 03:54:35 2018
>> @@ -123,3 +123,54 @@ for.end:
>>      tail call void @func(i32 %soft_use)
>>      ret void
>>    }
>> +
>> +; CHECK-LABEL: @test5(
>> +define void @test5(i32 %m) nounwind uwtable {
>> +entry:
>> +  br label %for.body
>> +
>> +for.body:                                         ; preds = %for.body, %entry
>> +  %i.06 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
>> +  %a.05 = phi i32 [ 0, %entry ], [ %add, %for.body ]
>> +  %add = add i32 %a.05, %m
>> +  %soft_use = add i32 %add, 123
>> +; CHECK: tail call void @func(i32 %soft_use)
>> +  tail call void @func(i32 %soft_use)
>> +  %inc = add nsw i32 %i.06, 1
>> +  %exitcond = icmp eq i32 %inc, 186
>> +  br i1 %exitcond, label %for.end, label %for.body
>> +
>> +for.end:                                          ; preds = %for.body
>> +; CHECK: for.end:
>> +; CHECK-NOT: mul i32 %m, 186
>> +; CHECK:%add.lcssa = phi i32 [ %add, %for.body ] ; CHECK-NEXT: tail 
>> +call void @func(i32 %add.lcssa)
>> +  tail call void @func(i32 %add)
>> +  ret void
>> +}
>> +
>> +; CHECK-LABEL: @test6(
>> +define void @test6(i32 %m, i32* %p) nounwind uwtable {
>> +entry:
>> +  br label %for.body
>> +
>> +for.body:                                         ; preds = %for.body, %entry
>> +  %i.06 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
>> +  %a.05 = phi i32 [ 0, %entry ], [ %add, %for.body ]
>> +  %add = add i32 %a.05, %m
>> +  %soft_use = add i32 %add, 123
>> +; CHECK: store i32 %soft_use, i32* %pidx
>> +  %pidx = getelementptr i32, i32* %p, i32 %add
>> +  store i32 %soft_use, i32* %pidx
>> +  %inc = add nsw i32 %i.06, 1
>> +  %exitcond = icmp eq i32 %inc, 186
>> +  br i1 %exitcond, label %for.end, label %for.body
>> +
>> +for.end:                                          ; preds = %for.body
>> +; CHECK: for.end:
>> +; CHECK-NOT: mul i32 %m, 186
>> +; CHECK:%add.lcssa = phi i32 [ %add, %for.body ] ; CHECK-NEXT: tail 
>> +call void @func(i32 %add.lcssa)
>> +  tail call void @func(i32 %add)
>> +  ret void
>> +}
>>
>> Modified:
>> llvm/trunk/test/Transforms/IndVarSimplify/lrev-existing-umin.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/IndVar
>> S 
>> implify/lrev-existing-umin.ll?rev=346397&r1=346396&r2=346397&view=dif
>> f 
>> =====================================================================
>> =
>> ========
>> --- llvm/trunk/test/Transforms/IndVarSimplify/lrev-existing-umin.ll
>> (original)
>> +++ llvm/trunk/test/Transforms/IndVarSimplify/lrev-existing-umin.ll
>> +++ Thu Nov  8 03:54:35 2018
>> @@ -1,5 +1,7 @@
>>    ; RUN: opt -S -indvars < %s | FileCheck %s
>>    
>> +; Do not rewrite the user outside the loop because we must keep the 
>> +instruction ; inside the loop due to store. Rewrite doesn't give us any profit.
>>    define void @f(i32 %length.i.88, i32 %length.i, i8* %tmp12, i32 %tmp10, i8* %tmp8) {
>>    ; CHECK-LABEL: @f(
>>    not_zero11.preheader:
>> @@ -21,6 +23,42 @@ not_zero11:
>>      %tmp22 = add nuw nsw i32 %v_1, 1
>>      %tmp23 = icmp slt i32 %tmp22, %tmp14
>>      br i1 %tmp23, label %not_zero11, label %main.exit.selector
>> +
>> +main.exit.selector:
>> +; CHECK-LABEL: main.exit.selector:
>> +; CHECK:   %tmp22.lcssa = phi i32 [ %tmp22, %not_zero11 ]
>> +; CHECK:   %tmp24 = icmp slt i32 %tmp22.lcssa, %length.
>> +  %tmp24 = icmp slt i32 %tmp22, %length.i
>> +  br i1 %tmp24, label %not_zero11.postloop, label %leave
>> +
>> +leave:
>> +  ret void
>> +
>> +not_zero11.postloop:
>> +  ret void
>> +}
>> +
>> +; Rewrite the user outside the loop because there is no hard users inside the loop.
>> +define void @f1(i32 %length.i.88, i32 %length.i, i8* %tmp12, i32 
>> +%tmp10, i8* %tmp8) { ; CHECK-LABEL: @f1(
>> +not_zero11.preheader:
>> +  %tmp13 = icmp ugt i32 %length.i, %length.i.88
>> +  %tmp14 = select i1 %tmp13, i32 %length.i.88, i32 %length.i
>> +  %tmp15 = icmp sgt i32 %tmp14, 0
>> +  br i1 %tmp15, label %not_zero11, label %not_zero11.postloop
>> +
>> +not_zero11:
>> +  %v_1 = phi i32 [ %tmp22, %not_zero11 ], [ 0, %not_zero11.preheader 
>> +]
>> +  %tmp16 = zext i32 %v_1 to i64
>> +  %tmp17 = getelementptr inbounds i8, i8* %tmp8, i64 %tmp16
>> +  %tmp18 = load i8, i8* %tmp17, align 1
>> +  %tmp19 = zext i8 %tmp18 to i32
>> +  %tmp20 = or i32 %tmp19, %tmp10
>> +  %tmp21 = trunc i32 %tmp20 to i8
>> +  %addr22 = getelementptr inbounds i8, i8* %tmp12, i64 %tmp16
>> +  %tmp22 = add nuw nsw i32 %v_1, 1
>> +  %tmp23 = icmp slt i32 %tmp22, %tmp14
>> +  br i1 %tmp23, label %not_zero11, label %main.exit.selector
>>    
>>    main.exit.selector:
>>    ; CHECK-LABEL: main.exit.selector:
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>



More information about the llvm-commits mailing list