[llvm] r300746 - [SCEV] Make SCEV or modeling more aggressive.

Thu Apr 20 18:37:36 PDT 2017

> On Apr 20, 2017, at 4:50 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> 
> I can't immediately think of any silver bullets here unfortunately.  I
> do want to take a look to see if some of the regression can be
> recovered by making the add expr handling faster in the pattern seen
> in this benchmark, but that's pretty uncertain.

Hm, the increase in SCEV time must be much bigger than 3X (since 3X is the increase in total time). Would it be possible limit the #expressions with an extra compile time option which getSCEV honors? By default the “old” code is invoked, but when the option is set, more (accurate) expressions are computed. Eli’s code is for one case - Or, but it seems similar patches could be applied for other logical expressions like And, Xor etc, and then many more programs could/will be hit by significant compile-time regressions.

Thanks
Gerolf

> 
> Were you able to figure out why we had a code size regression?
> 
> -- Sanjoy
> 
> On Thu, Apr 20, 2017 at 4:34 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>> On 4/20/2017 3:10 PM, Mikhail Zolotukhin wrote:
>> 
>> 
>> On Apr 20, 2017, at 2:39 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>> 
>> The testcase is derived from a loop we were looking at while doing some
>> polly-related work; not sure if it has any visible performance benefit
>> elsewhere.
>> 
>> That said, I wasn't expecting a compile-time regression, and the codesize
>> regression is definitely weird.  Can you get a preprocessed source file for
>> me to look at?
>> 
>> Here it is:
>> 
>> 
>> 
>> It's one of source files of ClamAV, which is a part of LLVM test-suite.
>> Compile time on this file increased almost 3x in my preliminary measurements
>> (and looking at the source, function "body" has a lot of ORs, so probably
>> it's the culprit).
>> 
>> 
>> Yes, that's the loop with the problem.
>> 
>> The loop has a bunch of repeated operations... and basically every operation
>> in the loop is an addition involving a previous operation in the loop.
>> Before, we would treat the rotate as an opaque operation, which prevented
>> the size of the expression from exploding, but this patch allows SCEV to
>> model a rotate expression as "(x * pow(2, 7)) + (x / pow(2,(32-7)))", so we
>> build complicated SCEV expressions for every instruction in the loop.  I'm
>> seeing call stacks with hundreds of calls to getSCEV.
>> 
>> We aren't spending time in haveNoCommonBitsSet itself.
>> 
>> I'm not sure what the right solution is to limit the amount of time we spend
>> in SCEV here.  Sanjoy, any ideas?
>> 
>> -Eli
>> 
>> 
>> Michael
>> 
>> 
>> -Eli
>> 
>> On 4/20/2017 1:02 PM, Mikhail Zolotukhin wrote:
>> 
>> Hi Eli,
>> 
>> It looks like this change caused a big compile time and codesize regression
>> on clamscan [1]. Is it expected, and if so, are there any gains from this
>> change that overweigh the losses? I realize that the change itself looks
>> right useful, but if we get nothing from it then probably we should find a
>> way to mitigate negative effects first.
>> 
>> Thanks,
>> Michael
>> 
>> [1] http://104.154.54.203/db_default/v4/nts/44526
>> 
>> 
>> On Apr 19, 2017, at 1:19 PM, Eli Friedman via llvm-commits
>> <llvm-commits at lists.llvm.org> wrote:
>> 
>> Author: efriedma
>> Date: Wed Apr 19 15:19:58 2017
>> New Revision: 300746
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=300746&view=rev
>> Log:
>> [SCEV] Make SCEV or modeling more aggressive.
>> 
>> Use haveNoCommonBitsSet to figure out whether an "or" instruction
>> is equivalent to addition. This handles more cases than just
>> checking for a constant on the RHS.
>> 
>> Differential Revision: https://reviews.llvm.org/D32239
>> 
>> 
>> Added:
>>   llvm/trunk/test/Analysis/ScalarEvolution/or-as-add.ll
>> Modified:
>>   llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>> 
>> Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ScalarEvolution.cpp?rev=300746&r1=300745&r2=300746&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original)
>> +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Wed Apr 19 15:19:58 2017
>> @@ -5328,28 +5328,12 @@ const SCEV *ScalarEvolution::createSCEV(
>>      break;
>> 
>>    case Instruction::Or:
>> -      // If the RHS of the Or is a constant, we may have something like:
>> -      // X*4+1 which got turned into X*4|1.  Handle this as an Add so loop
>> -      // optimizations will transparently handle this case.
>> -      //
>> -      // In order for this transformation to be safe, the LHS must be of
>> the
>> -      // form X*(2^n) and the Or constant must be less than 2^n.
>> -      if (ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS)) {
>> -        const SCEV *LHS = getSCEV(BO->LHS);
>> -        const APInt &CIVal = CI->getValue();
>> -        if (GetMinTrailingZeros(LHS) >=
>> -            (CIVal.getBitWidth() - CIVal.countLeadingZeros())) {
>> -          // Build a plain add SCEV.
>> -          const SCEV *S = getAddExpr(LHS, getSCEV(CI));
>> -          // If the LHS of the add was an addrec and it has no-wrap flags,
>> -          // transfer the no-wrap flags, since an or won't introduce a
>> wrap.
>> -          if (const SCEVAddRecExpr *NewAR = dyn_cast<SCEVAddRecExpr>(S)) {
>> -            const SCEVAddRecExpr *OldAR = cast<SCEVAddRecExpr>(LHS);
>> -            const_cast<SCEVAddRecExpr *>(NewAR)->setNoWrapFlags(
>> -                OldAR->getNoWrapFlags());
>> -          }
>> -          return S;
>> -        }
>> +      // Use ValueTracking to check whether this is actually an add.
>> +      if (haveNoCommonBitsSet(BO->LHS, BO->RHS, getDataLayout(), &AC,
>> +                              nullptr, &DT)) {
>> +        // There aren't any common bits set, so the add can't wrap.
>> +        auto Flags = SCEV::NoWrapFlags(SCEV::FlagNUW | SCEV::FlagNSW);
>> +        return getAddExpr(getSCEV(BO->LHS), getSCEV(BO->RHS), Flags);
>>      }
>>      break;
>> 
>> 
>> Added: llvm/trunk/test/Analysis/ScalarEvolution/or-as-add.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/ScalarEvolution/or-as-add.ll?rev=300746&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/Analysis/ScalarEvolution/or-as-add.ll (added)
>> +++ llvm/trunk/test/Analysis/ScalarEvolution/or-as-add.ll Wed Apr 19
>> 15:19:58 2017
>> @@ -0,0 +1,38 @@
>> +; RUN: opt < %s -analyze -scalar-evolution | FileCheck %s
>> +
>> +declare void @z(i32)
>> +declare void @z2(i64)
>> +
>> +define void @fun(i1 %bool, i32 %x) {
>> +entry:
>> +        br label %body
>> +body:
>> +        %i = phi i32 [ 0, %entry ], [ %i.next, %body ]
>> +        %bottom_zero = mul i32 %i, 2
>> +        %a = or i32 %bottom_zero, 1
>> +        call void @z(i32 %a)
>> +        %bool_ext = zext i1 %bool to i32
>> +        %b = or i32 %bool_ext, %bottom_zero
>> +        call void @z(i32 %b)
>> +        %shifted = lshr i32 %x, 31
>> +        %c = or i32 %shifted, %bottom_zero
>> +        call void @z(i32 %c)
>> +        %i_ext = zext i32 %i to i64
>> +        %d = or i64 %i_ext, 4294967296
>> +        call void @z2(i64 %d)
>> +        %i.next = add i32 %i, 1
>> +        %cond = icmp eq i32 %i.next, 10
>> +        br i1 %cond, label %exit, label %body
>> +exit:
>> +        ret void
>> +}
>> +
>> +; CHECK: %a = or i32 %bottom_zero, 1
>> +; CHECK-NEXT: -->  {1,+,2}<%body>
>> +; CHECK: %b = or i32 %bool_ext, %bottom_zero
>> +; CHECK-NEXT: -->  {(zext i1 %bool to i32),+,2}
>> +; CHECK: %c = or i32 %shifted, %bottom_zero
>> +; CHECK-NEXT: -->  {(%x /u -2147483648),+,2}<%body>
>> +; CHECK: %d = or i64 %i_ext, 4294967296
>> +; CHECK-NEXT: -->  {4294967296,+,1}<nuw><nsw><%body>
>> +
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>> 
>> 
>> 
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
>> Foundation Collaborative Project
>> 
>> 
>> 
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
>> Foundation Collaborative Project