[llvm] d4648ee - [SCEV] Use trip count information to improve shift recurrence ranges

Mon Mar 22 10:34:20 PDT 2021

That didn't work, but it did get a more readable error.  Try 2 on a fix 
in 93ce855

/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/lib/Analysis/ScalarEvolution.cpp: In member function ‘llvm::ConstantRange llvm::ScalarEvolution::getRangeForUnknownRecurrence(const llvm::SCEVUnknown*)’:
/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/lib/Analysis/ScalarEvolution.cpp:5702:35: error: call of overloaded ‘APInt(unsigned int&, unsigned int, bool)’ is ambiguous
    APInt TCAP(BitWidth, TC-1, false);
                                    ^
In file included from /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/include/llvm/Analysis/ScalarEvolution.h:23:0,
                  from /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/lib/Analysis/ScalarEvolution.cpp:60:
/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/include/llvm/ADT/APInt.h:305:3: note: candidate: llvm::APInt::APInt(unsigned int, unsigned int, const uint64_t*)
    APInt(unsigned numBits, unsigned numWords, const uint64_t bigVal[]);
    ^
/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/include/llvm/ADT/APInt.h:278:3: note: candidate: llvm::APInt::APInt(unsigned int, uint64_t, bool)
    APInt(unsigned numBits, uint64_t val, bool isSigned = false)
    ^

On 3/22/21 10:25 AM, Philip Reames via llvm-commits wrote:
> FYI, this made the windows builders unhappy.  I'm unclear what the 
> problem is.  I took a wild guess in 6ba73c47, but if that doesn't 
> work, I may need some help with someone knowledgeable on MSVC.
>
> Here's the warnings I'm seeing:
>
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5703): 
> error C2440: '<function-style-cast>': cannot convert from 'initializer 
> list' to 'llvm::APInt'
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5703): 
> note: No constructor could take the source type, or constructor 
> overload resolution was ambiguous
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5709): 
> error C3536: 'TotalShift': cannot be used before it is initialized
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5709): 
> error C2664: 'llvm::KnownBits llvm::KnownBits::makeConstant(const 
> llvm::APInt &)': cannot convert argument 1 from 'int' to 'const 
> llvm::APInt &'
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5709): 
> note: Reason: cannot convert from 'int' to 'const llvm::APInt'
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5709): 
> note: No constructor could take the source type, or constructor 
> overload resolution was ambiguous
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5708): 
> error C2660: 'llvm::KnownBits::shl': function does not take 1 arguments
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\include\llvm/Support/KnownBits.h(328): 
> note: see declaration of 'llvm::KnownBits::shl'
> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5712): 
> error C3536: 'KnownEnd': cannot be used before it is initialized
>
> On 3/22/21 9:39 AM, Philip Reames via llvm-commits wrote:
>> Author: Philip Reames
>> Date: 2021-03-22T09:38:43-07:00
>> New Revision: d4648eeaa270fe787d8158596a0f58e0426ed498
>>
>> URL: 
>> https://github.com/llvm/llvm-project/commit/d4648eeaa270fe787d8158596a0f58e0426ed498
>> DIFF: 
>> https://github.com/llvm/llvm-project/commit/d4648eeaa270fe787d8158596a0f58e0426ed498.diff
>>
>> LOG: [SCEV] Use trip count information to improve shift recurrence 
>> ranges
>>
>> This patch exploits the knowledge that we may be running many fewer 
>> than bitwidth iterations of the loop, and may be able to disallow the 
>> overflow case. This patch specifically implements only the shl case, 
>> but this can be generalized to ashr and lshr without difficulty.
>>
>> Differential Revision: https://reviews.llvm.org/D98222
>>
>> Added:
>>
>> Modified:
>>      llvm/include/llvm/Analysis/ScalarEvolution.h
>>      llvm/lib/Analysis/ScalarEvolution.cpp
>>      llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll
>>
>> Removed:
>>
>>
>> ################################################################################ 
>>
>> diff  --git a/llvm/include/llvm/Analysis/ScalarEvolution.h 
>> b/llvm/include/llvm/Analysis/ScalarEvolution.h
>> index 206e502673a9..8407be99a82b 100644
>> --- a/llvm/include/llvm/Analysis/ScalarEvolution.h
>> +++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
>> @@ -1574,6 +1574,12 @@ class ScalarEvolution {
>>     ConstantRange getRangeViaFactoring(const SCEV *Start, const SCEV 
>> *Stop,
>>                                        const SCEV *MaxBECount, 
>> unsigned BitWidth);
>>   +  /// If the unknown expression U corresponds to a simple 
>> recurrence, return
>> +  /// a constant range which represents the entire recurrence. Note 
>> that
>> +  /// *add* recurrences with loop invariant steps aren't represented by
>> +  /// SCEVUnknowns and thus don't use this mechanism.
>> +  ConstantRange getRangeForUnknownRecurrence(const SCEVUnknown *U);
>> +
>>     /// We know that there is no SCEV for the specified value. 
>> Analyze the
>>     /// expression.
>>     const SCEV *createSCEV(Value *V);
>>
>> diff  --git a/llvm/lib/Analysis/ScalarEvolution.cpp 
>> b/llvm/lib/Analysis/ScalarEvolution.cpp
>> index f12ebe3a8727..92294dc3e6c1 100644
>> --- a/llvm/lib/Analysis/ScalarEvolution.cpp
>> +++ b/llvm/lib/Analysis/ScalarEvolution.cpp
>> @@ -5645,6 +5645,76 @@ void 
>> ScalarEvolution::setNoWrapFlags(SCEVAddRecExpr *AddRec,
>>     }
>>   }
>>   +ConstantRange ScalarEvolution::
>> +getRangeForUnknownRecurrence(const SCEVUnknown *U) {
>> +  const DataLayout &DL = getDataLayout();
>> +
>> +  unsigned BitWidth = getTypeSizeInBits(U->getType());
>> +  ConstantRange CR(BitWidth, /*isFullSet=*/true);
>> +
>> +  // Match a simple recurrence of the form: <start, ShiftOp, Step>, 
>> and then
>> +  // use information about the trip count to improve our available 
>> range.  Note
>> +  // that the trip count independent cases are already handled by 
>> known bits.
>> +  // WARNING: The definition of recurrence used here is subtly
>> diff erent than
>> +  // the one used by AddRec (and thus most of this file).  Step is 
>> allowed to
>> +  // be arbitrarily loop varying here, where AddRec allows only loop 
>> invariant
>> +  // and other addrecs in the same loop (for non-affine addrecs).  
>> The code
>> +  // below intentionally handles the case where step is not loop 
>> invariant.
>> +  auto *P = dyn_cast<PHINode>(U->getValue());
>> +  if (!P)
>> +    return CR;
>> +
>> +  BinaryOperator *BO;
>> +  Value *Start, *Step;
>> +  if (!matchSimpleRecurrence(P, BO, Start, Step))
>> +    return CR;
>> +
>> +  // If we found a recurrence, we must be in a loop -- unless we're
>> +  // in unreachable code where dominance collapses.  Note that BO might
>> +  // be in some subloop of L, and that's completely okay.
>> +  auto *L = LI.getLoopFor(P->getParent());
>> +  if (!L)
>> +    return CR;
>> +  assert(L->getHeader() == P->getParent());
>> +  if (!L->contains(BO->getParent()))
>> +    // NOTE: This bailout should be an assert instead. However, 
>> asserting
>> +    // the condition here exposes a case where LoopFusion is 
>> querying SCEV
>> +    // with malformed loop information during the midst of the 
>> transform.
>> +    // There doesn't appear to be an obvious fix, so for the moment 
>> bailout
>> +    // until the caller issue can be fixed.  PR49566 tracks the bug.
>> +    return CR;
>> +
>> +  // TODO: Handle ashr and lshr cases to increase minimum value 
>> reported
>> +  if (BO->getOpcode() != Instruction::Shl || BO->getOperand(0) != P)
>> +    return CR;
>> +
>> +  unsigned TC = getSmallConstantMaxTripCount(L);
>> +  if (!TC || TC >= BitWidth)
>> +    return CR;
>> +
>> +  auto KnownStart = computeKnownBits(Start, DL, 0, &AC, nullptr, &DT);
>> +  auto KnownStep = computeKnownBits(Step, DL, 0, &AC, nullptr, &DT);
>> +  assert(KnownStart.getBitWidth() == BitWidth &&
>> +         KnownStep.getBitWidth() == BitWidth);
>> +
>> +  // Compute total shift amount, being careful of overflow and 
>> bitwidths.
>> +  auto MaxShiftAmt = KnownStep.getMaxValue();
>> +  bool Overflow = false;
>> +  auto TotalShift = MaxShiftAmt.umul_ov(APInt(BitWidth, TC-1, 
>> false), Overflow);
>> +  if (Overflow)
>> +    return CR;
>> +
>> +  // Iff no bits are shifted out, value increases on every shift.
>> +  auto KnownEnd = KnownBits::shl(KnownStart,
>> + KnownBits::makeConstant(TotalShift));
>> +  if (TotalShift.ult(KnownStart.countMinLeadingZeros()))
>> +    CR = CR.intersectWith(ConstantRange(KnownStart.getMinValue(),
>> +                                        KnownEnd.getMaxValue() + 1));
>> +  return CR;
>> +}
>> +
>> +
>> +
>>   /// Determine the range for a particular SCEV.  If SignHint is
>>   /// HINT_RANGE_UNSIGNED (resp. HINT_RANGE_SIGNED) then getRange 
>> prefers ranges
>>   /// with a "cleaner" unsigned (resp. signed) representation.
>> @@ -5845,12 +5915,19 @@ ScalarEvolution::getRangeRef(const SCEV *S,
>>     }
>>       if (const SCEVUnknown *U = dyn_cast<SCEVUnknown>(S)) {
>> +
>>       // Check if the IR explicitly contains !range metadata.
>>       Optional<ConstantRange> MDRange = 
>> GetRangeFromMetadata(U->getValue());
>>       if (MDRange.hasValue())
>>         ConservativeResult = 
>> ConservativeResult.intersectWith(MDRange.getValue(),
>> RangeType);
>>   +    // Use facts about recurrences in the underlying IR. Note that 
>> add
>> +    // recurrences are AddRecExprs and thus don't hit this path.  This
>> +    // primarily handles shift recurrences.
>> +    auto CR = getRangeForUnknownRecurrence(U);
>> +    ConservativeResult = ConservativeResult.intersectWith(CR);
>> +
>>       // See if ValueTracking can give us a useful range.
>>       const DataLayout &DL = getDataLayout();
>>       KnownBits Known = computeKnownBits(U->getValue(), DL, 0, &AC, 
>> nullptr, &DT);
>>
>> diff  --git a/llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll 
>> b/llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll
>> index 4eb9c9ea50a6..5f34ad42ab5f 100644
>> --- a/llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll
>> +++ b/llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll
>> @@ -193,11 +193,11 @@ define void @test_shl2() {
>>   ; CHECK-NEXT:    %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
>>   ; CHECK-NEXT:    --> {0,+,1}<%loop> U: [0,5) S: [0,5) Exits: 4 
>> LoopDispositions: { %loop: Computable }
>>   ; CHECK-NEXT:    %iv.shl = phi i64 [ 4, %entry ], [ %iv.shl.next, 
>> %loop ]
>> -; CHECK-NEXT:    --> %iv.shl U: [0,-3) S: 
>> [-9223372036854775808,9223372036854775801) Exits: 64 
>> LoopDispositions: { %loop: Variant }
>> +; CHECK-NEXT:    --> %iv.shl U: [4,65) S: [4,65) Exits: 64 
>> LoopDispositions: { %loop: Variant }
>>   ; CHECK-NEXT:    %iv.next = add i64 %iv, 1
>>   ; CHECK-NEXT:    --> {1,+,1}<%loop> U: [1,6) S: [1,6) Exits: 5 
>> LoopDispositions: { %loop: Computable }
>>   ; CHECK-NEXT:    %iv.shl.next = shl i64 %iv.shl, 1
>> -; CHECK-NEXT:    --> (2 * %iv.shl) U: [0,-7) S: 
>> [-9223372036854775808,9223372036854775801) Exits: 128 
>> LoopDispositions: { %loop: Variant }
>> +; CHECK-NEXT:    --> (2 * %iv.shl)<nuw><nsw> U: [8,129) S: [8,129) 
>> Exits: 128 LoopDispositions: { %loop: Variant }
>>   ; CHECK-NEXT:  Determining loop execution counts for: @test_shl2
>>   ; CHECK-NEXT:  Loop %loop: backedge-taken count is 4
>>   ; CHECK-NEXT:  Loop %loop: max backedge-taken count is 4
>> @@ -227,7 +227,7 @@ define void @test_shl3(i1 %c) {
>>   ; CHECK-NEXT:    %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
>>   ; CHECK-NEXT:    --> {0,+,1}<%loop> U: [0,5) S: [0,5) Exits: 4 
>> LoopDispositions: { %loop: Computable }
>>   ; CHECK-NEXT:    %iv.shl = phi i64 [ 4, %entry ], [ %iv.shl.next, 
>> %loop ]
>> -; CHECK-NEXT:    --> %iv.shl U: [0,-3) S: 
>> [-9223372036854775808,9223372036854775805) Exits: <<Unknown>> 
>> LoopDispositions: { %loop: Variant }
>> +; CHECK-NEXT:    --> %iv.shl U: [4,65) S: [4,65) Exits: <<Unknown>> 
>> LoopDispositions: { %loop: Variant }
>>   ; CHECK-NEXT:    %iv.next = add i64 %iv, 1
>>   ; CHECK-NEXT:    --> {1,+,1}<%loop> U: [1,6) S: [1,6) Exits: 5 
>> LoopDispositions: { %loop: Computable }
>>   ; CHECK-NEXT:    %iv.shl.next = shl i64 %iv.shl, %shiftamt
>> @@ -260,11 +260,11 @@ define void @test_shl4() {
>>   ; CHECK-NEXT:    %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
>>   ; CHECK-NEXT:    --> {0,+,1}<%loop> U: [0,61) S: [0,61) Exits: 60 
>> LoopDispositions: { %loop: Computable }
>>   ; CHECK-NEXT:    %iv.shl = phi i64 [ 4, %entry ], [ %iv.shl.next, 
>> %loop ]
>> -; CHECK-NEXT:    --> %iv.shl U: [0,-3) S: 
>> [-9223372036854775808,9223372036854775801) Exits: 4611686018427387904 
>> LoopDispositions: { %loop: Variant }
>> +; CHECK-NEXT:    --> %iv.shl U: [4,4611686018427387905) S: 
>> [4,4611686018427387905) Exits: 4611686018427387904 LoopDispositions: 
>> { %loop: Variant }
>>   ; CHECK-NEXT:    %iv.next = add i64 %iv, 1
>>   ; CHECK-NEXT:    --> {1,+,1}<%loop> U: [1,62) S: [1,62) Exits: 61 
>> LoopDispositions: { %loop: Computable }
>>   ; CHECK-NEXT:    %iv.shl.next = shl i64 %iv.shl, 1
>> -; CHECK-NEXT:    --> (2 * %iv.shl) U: [0,-7) S: 
>> [-9223372036854775808,9223372036854775801) Exits: 
>> -9223372036854775808 LoopDispositions: { %loop: Variant }
>> +; CHECK-NEXT:    --> (2 * %iv.shl)<nuw> U: [8,-9223372036854775807) 
>> S: [-9223372036854775808,9223372036854775801) Exits: 
>> -9223372036854775808 LoopDispositions: { %loop: Variant }
>>   ; CHECK-NEXT:  Determining loop execution counts for: @test_shl4
>>   ; CHECK-NEXT:  Loop %loop: backedge-taken count is 60
>>   ; CHECK-NEXT:  Loop %loop: max backedge-taken count is 60
>> @@ -324,7 +324,7 @@ define void @test_shl6(i1 %c) {
>>   ; CHECK-NEXT:    %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
>>   ; CHECK-NEXT:    --> {0,+,1}<%loop> U: [0,5) S: [0,5) Exits: 4 
>> LoopDispositions: { %loop: Computable }
>>   ; CHECK-NEXT:    %iv.shl = phi i64 [ 4, %entry ], [ %iv.shl.next, 
>> %loop ]
>> -; CHECK-NEXT:    --> %iv.shl U: [0,-3) S: 
>> [-9223372036854775808,9223372036854775805) Exits: 16 
>> LoopDispositions: { %loop: Variant }
>> +; CHECK-NEXT:    --> %iv.shl U: [4,65) S: [4,65) Exits: 16 
>> LoopDispositions: { %loop: Variant }
>>   ; CHECK-NEXT:    %iv.next = add i64 %iv, 1
>>   ; CHECK-NEXT:    --> {1,+,1}<%loop> U: [1,6) S: [1,6) Exits: 5 
>> LoopDispositions: { %loop: Computable }
>>   ; CHECK-NEXT:    %shiftamt = and i64 %iv, 1
>>
>>
>>          _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits