[llvm] d4648ee - [SCEV] Use trip count information to improve shift recurrence ranges

Mon Mar 22 11:12:20 PDT 2021

2nd attempt appears to have cleared the bots.

Philip

On 3/22/21 10:34 AM, Philip Reames via llvm-commits wrote:
> That didn't work, but it did get a more readable error.  Try 2 on a 
> fix in 93ce855
>
> /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/lib/Analysis/ScalarEvolution.cpp: 
> In member function ‘llvm::ConstantRange 
> llvm::ScalarEvolution::getRangeForUnknownRecurrence(const 
> llvm::SCEVUnknown*)’:
> /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/lib/Analysis/ScalarEvolution.cpp:5702:35: 
> error: call of overloaded ‘APInt(unsigned int&, unsigned int, bool)’ 
> is ambiguous
>    APInt TCAP(BitWidth, TC-1, false);
>                                    ^
> In file included from 
> /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/include/llvm/Analysis/ScalarEvolution.h:23:0,
>                  from 
> /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/lib/Analysis/ScalarEvolution.cpp:60:
> /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/include/llvm/ADT/APInt.h:305:3: 
> note: candidate: llvm::APInt::APInt(unsigned int, unsigned int, const 
> uint64_t*)
>    APInt(unsigned numBits, unsigned numWords, const uint64_t bigVal[]);
>    ^
> /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/llvm/include/llvm/ADT/APInt.h:278:3: 
> note: candidate: llvm::APInt::APInt(unsigned int, uint64_t, bool)
>    APInt(unsigned numBits, uint64_t val, bool isSigned = false)
>    ^
>
> On 3/22/21 10:25 AM, Philip Reames via llvm-commits wrote:
>> FYI, this made the windows builders unhappy.  I'm unclear what the 
>> problem is.  I took a wild guess in 6ba73c47, but if that doesn't 
>> work, I may need some help with someone knowledgeable on MSVC.
>>
>> Here's the warnings I'm seeing:
>>
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5703): 
>> error C2440: '<function-style-cast>': cannot convert from 
>> 'initializer list' to 'llvm::APInt'
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5703): 
>> note: No constructor could take the source type, or constructor 
>> overload resolution was ambiguous
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5709): 
>> error C3536: 'TotalShift': cannot be used before it is initialized
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5709): 
>> error C2664: 'llvm::KnownBits llvm::KnownBits::makeConstant(const 
>> llvm::APInt &)': cannot convert argument 1 from 'int' to 'const 
>> llvm::APInt &'
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5709): 
>> note: Reason: cannot convert from 'int' to 'const llvm::APInt'
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5709): 
>> note: No constructor could take the source type, or constructor 
>> overload resolution was ambiguous
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5708): 
>> error C2660: 'llvm::KnownBits::shl': function does not take 1 arguments
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\include\llvm/Support/KnownBits.h(328): 
>> note: see declaration of 'llvm::KnownBits::shl'
>> C:\buildbot\lldb-x64-windows-ninja\llvm-project\llvm\lib\Analysis\ScalarEvolution.cpp(5712): 
>> error C3536: 'KnownEnd': cannot be used before it is initialized
>>
>> On 3/22/21 9:39 AM, Philip Reames via llvm-commits wrote:
>>> Author: Philip Reames
>>> Date: 2021-03-22T09:38:43-07:00
>>> New Revision: d4648eeaa270fe787d8158596a0f58e0426ed498
>>>
>>> URL: 
>>> https://github.com/llvm/llvm-project/commit/d4648eeaa270fe787d8158596a0f58e0426ed498
>>> DIFF: 
>>> https://github.com/llvm/llvm-project/commit/d4648eeaa270fe787d8158596a0f58e0426ed498.diff
>>>
>>> LOG: [SCEV] Use trip count information to improve shift recurrence 
>>> ranges
>>>
>>> This patch exploits the knowledge that we may be running many fewer 
>>> than bitwidth iterations of the loop, and may be able to disallow 
>>> the overflow case. This patch specifically implements only the shl 
>>> case, but this can be generalized to ashr and lshr without difficulty.
>>>
>>> Differential Revision: https://reviews.llvm.org/D98222
>>>
>>> Added:
>>>
>>> Modified:
>>>      llvm/include/llvm/Analysis/ScalarEvolution.h
>>>      llvm/lib/Analysis/ScalarEvolution.cpp
>>>      llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll
>>>
>>> Removed:
>>>
>>>
>>> ################################################################################ 
>>>
>>> diff  --git a/llvm/include/llvm/Analysis/ScalarEvolution.h 
>>> b/llvm/include/llvm/Analysis/ScalarEvolution.h
>>> index 206e502673a9..8407be99a82b 100644
>>> --- a/llvm/include/llvm/Analysis/ScalarEvolution.h
>>> +++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
>>> @@ -1574,6 +1574,12 @@ class ScalarEvolution {
>>>     ConstantRange getRangeViaFactoring(const SCEV *Start, const SCEV 
>>> *Stop,
>>>                                        const SCEV *MaxBECount, 
>>> unsigned BitWidth);
>>>   +  /// If the unknown expression U corresponds to a simple 
>>> recurrence, return
>>> +  /// a constant range which represents the entire recurrence. Note 
>>> that
>>> +  /// *add* recurrences with loop invariant steps aren't 
>>> represented by
>>> +  /// SCEVUnknowns and thus don't use this mechanism.
>>> +  ConstantRange getRangeForUnknownRecurrence(const SCEVUnknown *U);
>>> +
>>>     /// We know that there is no SCEV for the specified value. 
>>> Analyze the
>>>     /// expression.
>>>     const SCEV *createSCEV(Value *V);
>>>
>>> diff  --git a/llvm/lib/Analysis/ScalarEvolution.cpp 
>>> b/llvm/lib/Analysis/ScalarEvolution.cpp
>>> index f12ebe3a8727..92294dc3e6c1 100644
>>> --- a/llvm/lib/Analysis/ScalarEvolution.cpp
>>> +++ b/llvm/lib/Analysis/ScalarEvolution.cpp
>>> @@ -5645,6 +5645,76 @@ void 
>>> ScalarEvolution::setNoWrapFlags(SCEVAddRecExpr *AddRec,
>>>     }
>>>   }
>>>   +ConstantRange ScalarEvolution::
>>> +getRangeForUnknownRecurrence(const SCEVUnknown *U) {
>>> +  const DataLayout &DL = getDataLayout();
>>> +
>>> +  unsigned BitWidth = getTypeSizeInBits(U->getType());
>>> +  ConstantRange CR(BitWidth, /*isFullSet=*/true);
>>> +
>>> +  // Match a simple recurrence of the form: <start, ShiftOp, Step>, 
>>> and then
>>> +  // use information about the trip count to improve our available 
>>> range.  Note
>>> +  // that the trip count independent cases are already handled by 
>>> known bits.
>>> +  // WARNING: The definition of recurrence used here is subtly
>>> diff erent than
>>> +  // the one used by AddRec (and thus most of this file). Step is 
>>> allowed to
>>> +  // be arbitrarily loop varying here, where AddRec allows only 
>>> loop invariant
>>> +  // and other addrecs in the same loop (for non-affine addrecs).  
>>> The code
>>> +  // below intentionally handles the case where step is not loop 
>>> invariant.
>>> +  auto *P = dyn_cast<PHINode>(U->getValue());
>>> +  if (!P)
>>> +    return CR;
>>> +
>>> +  BinaryOperator *BO;
>>> +  Value *Start, *Step;
>>> +  if (!matchSimpleRecurrence(P, BO, Start, Step))
>>> +    return CR;
>>> +
>>> +  // If we found a recurrence, we must be in a loop -- unless we're
>>> +  // in unreachable code where dominance collapses.  Note that BO 
>>> might
>>> +  // be in some subloop of L, and that's completely okay.
>>> +  auto *L = LI.getLoopFor(P->getParent());
>>> +  if (!L)
>>> +    return CR;
>>> +  assert(L->getHeader() == P->getParent());
>>> +  if (!L->contains(BO->getParent()))
>>> +    // NOTE: This bailout should be an assert instead. However, 
>>> asserting
>>> +    // the condition here exposes a case where LoopFusion is 
>>> querying SCEV
>>> +    // with malformed loop information during the midst of the 
>>> transform.
>>> +    // There doesn't appear to be an obvious fix, so for the moment 
>>> bailout
>>> +    // until the caller issue can be fixed.  PR49566 tracks the bug.
>>> +    return CR;
>>> +
>>> +  // TODO: Handle ashr and lshr cases to increase minimum value 
>>> reported
>>> +  if (BO->getOpcode() != Instruction::Shl || BO->getOperand(0) != P)
>>> +    return CR;
>>> +
>>> +  unsigned TC = getSmallConstantMaxTripCount(L);
>>> +  if (!TC || TC >= BitWidth)
>>> +    return CR;
>>> +
>>> +  auto KnownStart = computeKnownBits(Start, DL, 0, &AC, nullptr, &DT);
>>> +  auto KnownStep = computeKnownBits(Step, DL, 0, &AC, nullptr, &DT);
>>> +  assert(KnownStart.getBitWidth() == BitWidth &&
>>> +         KnownStep.getBitWidth() == BitWidth);
>>> +
>>> +  // Compute total shift amount, being careful of overflow and 
>>> bitwidths.
>>> +  auto MaxShiftAmt = KnownStep.getMaxValue();
>>> +  bool Overflow = false;
>>> +  auto TotalShift = MaxShiftAmt.umul_ov(APInt(BitWidth, TC-1, 
>>> false), Overflow);
>>> +  if (Overflow)
>>> +    return CR;
>>> +
>>> +  // Iff no bits are shifted out, value increases on every shift.
>>> +  auto KnownEnd = KnownBits::shl(KnownStart,
>>> + KnownBits::makeConstant(TotalShift));
>>> +  if (TotalShift.ult(KnownStart.countMinLeadingZeros()))
>>> +    CR = CR.intersectWith(ConstantRange(KnownStart.getMinValue(),
>>> + KnownEnd.getMaxValue() + 1));
>>> +  return CR;
>>> +}
>>> +
>>> +
>>> +
>>>   /// Determine the range for a particular SCEV.  If SignHint is
>>>   /// HINT_RANGE_UNSIGNED (resp. HINT_RANGE_SIGNED) then getRange 
>>> prefers ranges
>>>   /// with a "cleaner" unsigned (resp. signed) representation.
>>> @@ -5845,12 +5915,19 @@ ScalarEvolution::getRangeRef(const SCEV *S,
>>>     }
>>>       if (const SCEVUnknown *U = dyn_cast<SCEVUnknown>(S)) {
>>> +
>>>       // Check if the IR explicitly contains !range metadata.
>>>       Optional<ConstantRange> MDRange = 
>>> GetRangeFromMetadata(U->getValue());
>>>       if (MDRange.hasValue())
>>>         ConservativeResult = 
>>> ConservativeResult.intersectWith(MDRange.getValue(),
>>> RangeType);
>>>   +    // Use facts about recurrences in the underlying IR. Note 
>>> that add
>>> +    // recurrences are AddRecExprs and thus don't hit this path.  This
>>> +    // primarily handles shift recurrences.
>>> +    auto CR = getRangeForUnknownRecurrence(U);
>>> +    ConservativeResult = ConservativeResult.intersectWith(CR);
>>> +
>>>       // See if ValueTracking can give us a useful range.
>>>       const DataLayout &DL = getDataLayout();
>>>       KnownBits Known = computeKnownBits(U->getValue(), DL, 0, &AC, 
>>> nullptr, &DT);
>>>
>>> diff  --git 
>>> a/llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll 
>>> b/llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll
>>> index 4eb9c9ea50a6..5f34ad42ab5f 100644
>>> --- a/llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll
>>> +++ b/llvm/test/Analysis/ScalarEvolution/shift-recurrences.ll
>>> @@ -193,11 +193,11 @@ define void @test_shl2() {
>>>   ; CHECK-NEXT:    %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
>>>   ; CHECK-NEXT:    --> {0,+,1}<%loop> U: [0,5) S: [0,5) Exits: 4 
>>> LoopDispositions: { %loop: Computable }
>>>   ; CHECK-NEXT:    %iv.shl = phi i64 [ 4, %entry ], [ %iv.shl.next, 
>>> %loop ]
>>> -; CHECK-NEXT:    --> %iv.shl U: [0,-3) S: 
>>> [-9223372036854775808,9223372036854775801) Exits: 64 
>>> LoopDispositions: { %loop: Variant }
>>> +; CHECK-NEXT:    --> %iv.shl U: [4,65) S: [4,65) Exits: 64 
>>> LoopDispositions: { %loop: Variant }
>>>   ; CHECK-NEXT:    %iv.next = add i64 %iv, 1
>>>   ; CHECK-NEXT:    --> {1,+,1}<%loop> U: [1,6) S: [1,6) Exits: 5 
>>> LoopDispositions: { %loop: Computable }
>>>   ; CHECK-NEXT:    %iv.shl.next = shl i64 %iv.shl, 1
>>> -; CHECK-NEXT:    --> (2 * %iv.shl) U: [0,-7) S: 
>>> [-9223372036854775808,9223372036854775801) Exits: 128 
>>> LoopDispositions: { %loop: Variant }
>>> +; CHECK-NEXT:    --> (2 * %iv.shl)<nuw><nsw> U: [8,129) S: [8,129) 
>>> Exits: 128 LoopDispositions: { %loop: Variant }
>>>   ; CHECK-NEXT:  Determining loop execution counts for: @test_shl2
>>>   ; CHECK-NEXT:  Loop %loop: backedge-taken count is 4
>>>   ; CHECK-NEXT:  Loop %loop: max backedge-taken count is 4
>>> @@ -227,7 +227,7 @@ define void @test_shl3(i1 %c) {
>>>   ; CHECK-NEXT:    %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
>>>   ; CHECK-NEXT:    --> {0,+,1}<%loop> U: [0,5) S: [0,5) Exits: 4 
>>> LoopDispositions: { %loop: Computable }
>>>   ; CHECK-NEXT:    %iv.shl = phi i64 [ 4, %entry ], [ %iv.shl.next, 
>>> %loop ]
>>> -; CHECK-NEXT:    --> %iv.shl U: [0,-3) S: 
>>> [-9223372036854775808,9223372036854775805) Exits: <<Unknown>> 
>>> LoopDispositions: { %loop: Variant }
>>> +; CHECK-NEXT:    --> %iv.shl U: [4,65) S: [4,65) Exits: <<Unknown>> 
>>> LoopDispositions: { %loop: Variant }
>>>   ; CHECK-NEXT:    %iv.next = add i64 %iv, 1
>>>   ; CHECK-NEXT:    --> {1,+,1}<%loop> U: [1,6) S: [1,6) Exits: 5 
>>> LoopDispositions: { %loop: Computable }
>>>   ; CHECK-NEXT:    %iv.shl.next = shl i64 %iv.shl, %shiftamt
>>> @@ -260,11 +260,11 @@ define void @test_shl4() {
>>>   ; CHECK-NEXT:    %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
>>>   ; CHECK-NEXT:    --> {0,+,1}<%loop> U: [0,61) S: [0,61) Exits: 60 
>>> LoopDispositions: { %loop: Computable }
>>>   ; CHECK-NEXT:    %iv.shl = phi i64 [ 4, %entry ], [ %iv.shl.next, 
>>> %loop ]
>>> -; CHECK-NEXT:    --> %iv.shl U: [0,-3) S: 
>>> [-9223372036854775808,9223372036854775801) Exits: 
>>> 4611686018427387904 LoopDispositions: { %loop: Variant }
>>> +; CHECK-NEXT:    --> %iv.shl U: [4,4611686018427387905) S: 
>>> [4,4611686018427387905) Exits: 4611686018427387904 LoopDispositions: 
>>> { %loop: Variant }
>>>   ; CHECK-NEXT:    %iv.next = add i64 %iv, 1
>>>   ; CHECK-NEXT:    --> {1,+,1}<%loop> U: [1,62) S: [1,62) Exits: 61 
>>> LoopDispositions: { %loop: Computable }
>>>   ; CHECK-NEXT:    %iv.shl.next = shl i64 %iv.shl, 1
>>> -; CHECK-NEXT:    --> (2 * %iv.shl) U: [0,-7) S: 
>>> [-9223372036854775808,9223372036854775801) Exits: 
>>> -9223372036854775808 LoopDispositions: { %loop: Variant }
>>> +; CHECK-NEXT:    --> (2 * %iv.shl)<nuw> U: [8,-9223372036854775807) 
>>> S: [-9223372036854775808,9223372036854775801) Exits: 
>>> -9223372036854775808 LoopDispositions: { %loop: Variant }
>>>   ; CHECK-NEXT:  Determining loop execution counts for: @test_shl4
>>>   ; CHECK-NEXT:  Loop %loop: backedge-taken count is 60
>>>   ; CHECK-NEXT:  Loop %loop: max backedge-taken count is 60
>>> @@ -324,7 +324,7 @@ define void @test_shl6(i1 %c) {
>>>   ; CHECK-NEXT:    %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
>>>   ; CHECK-NEXT:    --> {0,+,1}<%loop> U: [0,5) S: [0,5) Exits: 4 
>>> LoopDispositions: { %loop: Computable }
>>>   ; CHECK-NEXT:    %iv.shl = phi i64 [ 4, %entry ], [ %iv.shl.next, 
>>> %loop ]
>>> -; CHECK-NEXT:    --> %iv.shl U: [0,-3) S: 
>>> [-9223372036854775808,9223372036854775805) Exits: 16 
>>> LoopDispositions: { %loop: Variant }
>>> +; CHECK-NEXT:    --> %iv.shl U: [4,65) S: [4,65) Exits: 16 
>>> LoopDispositions: { %loop: Variant }
>>>   ; CHECK-NEXT:    %iv.next = add i64 %iv, 1
>>>   ; CHECK-NEXT:    --> {1,+,1}<%loop> U: [1,6) S: [1,6) Exits: 5 
>>> LoopDispositions: { %loop: Computable }
>>>   ; CHECK-NEXT:    %shiftamt = and i64 %iv, 1
>>>
>>>
>>>          _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits