[llvm] r303730 - [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start
Maxim Kazantsev via llvm-commits
llvm-commits at lists.llvm.org
Wed May 24 08:20:46 PDT 2017
Hi Diana,
Thanks for your letter. It seems that the test failed on ARM configs while it passes on X86. It works fine for me in X86 config. The only idea I have is that the data layout (which is specified for this test as X86) somehow affects Opt's behavior. Unfortunately I don't have an ARM device to reproduce it. Could you please be so kind to check if the removal of that data layout line from the failing test helps for ARM?
Regards,
Max
-----Original Message-----
From: Diana Picus [mailto:diana.picus at linaro.org]
Sent: Wednesday, May 24, 2017 9:20 PM
To: Maxim Kazantsev <max.kazantsev at azul.com>
Cc: llvm-commits <llvm-commits at lists.llvm.org>
Subject: Re: [llvm] r303730 - [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start
Hi Max,
I reverted this because it broke the buildbots. See e.g.
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/7414
http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/8403
Did you get any emails from the bots?
Let me know if you need help reproducing the failure or if you want me to run a pre-commit before applying it again.
Cheers,
Diana
On 24 May 2017 at 10:52, Max Kazantsev via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> Author: mkazantsev
> Date: Wed May 24 03:52:18 2017
> New Revision: 303730
>
> URL: http://llvm.org/viewvc/llvm-project?rev=303730&view=rev
> Log:
> [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start
>
> When folding arguments of AddExpr or MulExpr with recurrences, we rely
> on the fact that the loop of our base recurrency is the bottom-lost in
> terms of domination. This assumption may be broken by an expression
> which is treated as invariant, and which depends on a complex Phi for
> which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence.
>
> Another reason why it might be invalid to fold SCEVUnknown into Phi
> start value is that unlike other SCEVs, SCEVUnknown are sometimes position-bound. For example, here:
>
> for (...) { // loop
> phi = {A,+,B}
> }
> X = load ...
> Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because
> X does not exist and cannot exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment).
> It is only valid to make such folding if X is defined before the loop.
> In this case the recurrence {A+X,+,B}<loop> may be existant.
>
> This patch prohibits folding of SCEVUnknown (and those who use them)
> into the start value of an AddRecExpr, if this instruction is
> dominated by the loop. Merging the dominating unknown values is still
> valid. Some tests that relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer expect such behavior.
>
> Modified:
> llvm/trunk/include/llvm/Analysis/ScalarEvolution.h
> llvm/trunk/lib/Analysis/ScalarEvolution.cpp
> llvm/trunk/test/Analysis/IVUsers/quadradic-exit-value.ll
> llvm/trunk/test/Analysis/ScalarEvolution/different-loops-recs.ll
> llvm/trunk/test/Transforms/LoopStrengthReduce/X86/incorrect-offset-scaling.ll
> llvm/trunk/test/Transforms/LoopStrengthReduce/lsr-expand-quadratic.ll
> llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll
>
> Modified: llvm/trunk/include/llvm/Analysis/ScalarEvolution.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/S
> calarEvolution.h?rev=303730&r1=303729&r2=303730&view=diff
> ======================================================================
> ========
> --- llvm/trunk/include/llvm/Analysis/ScalarEvolution.h (original)
> +++ llvm/trunk/include/llvm/Analysis/ScalarEvolution.h Wed May 24
> +++ 03:52:18 2017
> @@ -1533,6 +1533,12 @@ public:
> /// specified loop.
> bool isLoopInvariant(const SCEV *S, const Loop *L);
>
> + /// Determine if the SCEV can be evaluated at loop's entry. It is
> + true if it /// doesn't depend on a SCEVUnknown of an instruction
> + which is dominated by /// the header of loop L.
> + bool isAvailableAtLoopEntry(const SCEV *S, const Loop *L, DominatorTree &DT,
> + LoopInfo &LI);
> +
> /// Return true if the given SCEV changes value in a known way in the
> /// specified loop. This property being true implies that the value is
> /// variant in the loop AND that we can emit an expression to
> compute the
>
> Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ScalarEvol
> ution.cpp?rev=303730&r1=303729&r2=303730&view=diff
> ======================================================================
> ========
> --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original)
> +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Wed May 24 03:52:18
> +++ 2017
> @@ -2178,6 +2178,63 @@ StrengthenNoWrapFlags(ScalarEvolution *S
> return Flags;
> }
>
> +bool ScalarEvolution::isAvailableAtLoopEntry(const SCEV *S, const Loop *L,
> + DominatorTree &DT,
> +LoopInfo &LI) {
> + if (!isLoopInvariant(S, L))
> + return false;
> + // If a value depends on a SCEVUnknown which is defined after the
> +loop, we
> + // conservatively assume that we cannot calculate it at the loop's entry.
> + struct FindDominatedSCEVUnknown {
> + bool Found = false;
> + const Loop *L;
> + DominatorTree &DT;
> + LoopInfo &LI;
> +
> + FindDominatedSCEVUnknown(const Loop *L, DominatorTree &DT, LoopInfo &LI)
> + : L(L), DT(DT), LI(LI) {}
> +
> + bool checkSCEVUnknown(const SCEVUnknown *SU) {
> + if (auto *I = dyn_cast<Instruction>(SU->getValue())) {
> + if (DT.dominates(L->getHeader(), I->getParent()))
> + Found = true;
> + else
> + assert(DT.dominates(I->getParent(), L->getHeader()) &&
> + "No dominance relationship between SCEV and loop?");
> + }
> + return false;
> + }
> +
> + bool follow(const SCEV *S) {
> + switch (static_cast<SCEVTypes>(S->getSCEVType())) {
> + case scConstant:
> + return false;
> + case scAddRecExpr:
> + case scTruncate:
> + case scZeroExtend:
> + case scSignExtend:
> + case scAddExpr:
> + case scMulExpr:
> + case scUMaxExpr:
> + case scSMaxExpr:
> + case scUDivExpr:
> + return true;
> + case scUnknown:
> + return checkSCEVUnknown(cast<SCEVUnknown>(S));
> + case scCouldNotCompute:
> + llvm_unreachable("Attempt to use a SCEVCouldNotCompute object!");
> + }
> + return false;
> + }
> +
> + bool isDone() { return Found; }
> + };
> +
> + FindDominatedSCEVUnknown FSU(L, DT, LI);
> + SCEVTraversal<FindDominatedSCEVUnknown> ST(FSU);
> + ST.visitAll(S);
> + return !FSU.Found;
> +}
> +
> /// Get a canonical add expression, or something simpler if possible.
> const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
> SCEV::NoWrapFlags Flags, @@
> -2459,7 +2516,7 @@ const SCEV *ScalarEvolution::getAddExpr(
> const SCEVAddRecExpr *AddRec = cast<SCEVAddRecExpr>(Ops[Idx]);
> const Loop *AddRecLoop = AddRec->getLoop();
> for (unsigned i = 0, e = Ops.size(); i != e; ++i)
> - if (isLoopInvariant(Ops[i], AddRecLoop)) {
> + if (isAvailableAtLoopEntry(Ops[i], AddRecLoop, DT, LI)) {
> LIOps.push_back(Ops[i]);
> Ops.erase(Ops.begin()+i);
> --i; --e;
> @@ -2734,7 +2791,7 @@ const SCEV *ScalarEvolution::getMulExpr(
> const SCEVAddRecExpr *AddRec = cast<SCEVAddRecExpr>(Ops[Idx]);
> const Loop *AddRecLoop = AddRec->getLoop();
> for (unsigned i = 0, e = Ops.size(); i != e; ++i)
> - if (isLoopInvariant(Ops[i], AddRecLoop)) {
> + if (isAvailableAtLoopEntry(Ops[i], AddRecLoop, DT, LI)) {
> LIOps.push_back(Ops[i]);
> Ops.erase(Ops.begin()+i);
> --i; --e;
>
> Modified: llvm/trunk/test/Analysis/IVUsers/quadradic-exit-value.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/IVUsers/q
> uadradic-exit-value.ll?rev=303730&r1=303729&r2=303730&view=diff
> ======================================================================
> ========
> --- llvm/trunk/test/Analysis/IVUsers/quadradic-exit-value.ll
> (original)
> +++ llvm/trunk/test/Analysis/IVUsers/quadradic-exit-value.ll Wed May
> +++ 24 03:52:18 2017
> @@ -30,13 +30,47 @@ exit:
> ret i64 %r
> }
>
> +; PR15470: LSR miscompile. The test1 function should return '1'.
> +; It is valid to fold SCEVUnknown into the recurrence because it ;
> +was defined before the loop.
> +;
> +; SCEV does not know how to denormalize chained recurrences, so make
> +; sure they aren't marked as post-inc users.
> +;
> +; CHECK-LABEL: IV Users for loop %test1.loop
> +; CHECK-NO-LCSSA: %sext.us = {0,+,(16777216 + (-16777216 * %sub.us))<nuw><nsw>,+,33554432}<%test1.loop> (post-inc with loop %test1.loop) in %f = ashr i32 %sext.us, 24
> +define i32 @test1(i1 %cond) {
> +entry:
> + %sub.us = select i1 %cond, i32 0, i32 0
> + br label %test1.loop
> +
> +test1.loop:
> + %inc1115.us = phi i32 [ 0, %entry ], [ %inc11.us, %test1.loop ]
> + %inc11.us = add nsw i32 %inc1115.us, 1
> + %cmp.us = icmp slt i32 %inc11.us, 2
> + br i1 %cmp.us, label %test1.loop, label %for.end
> +
> +for.end:
> + %tobool.us = icmp eq i32 %inc1115.us, 0
> + %mul.us = shl i32 %inc1115.us, 24
> + %sub.cond.us = sub nsw i32 %inc1115.us, %sub.us
> + %sext.us = mul i32 %mul.us, %sub.cond.us
> + %f = ashr i32 %sext.us, 24
> + br label %exit
> +
> +exit:
> + ret i32 %f
> +}
> +
> ; PR15470: LSR miscompile. The test2 function should return '1'.
> +; It is illegal to fold SCEVUnknown (sext.us) into the recurrence ;
> +because it is defined after the loop where this recurrence belongs.
> ;
> ; SCEV does not know how to denormalize chained recurrences, so make
> ; sure they aren't marked as post-inc users.
> ;
> ; CHECK-LABEL: IV Users for loop %test2.loop
> -; CHECK-NO-LCSSA: %sext.us = {0,+,(16777216 + (-16777216 * %sub.us))<nuw><nsw>,+,33554432}<%test2.loop> (post-inc with loop %test2.loop) in %f = ashr i32 %sext.us, 24
> +; CHECK-NO-LCSSA: %sub.cond.us = ((-1 * %sub.us)<nsw> + {0,+,1}<nuw><nsw><%test2.loop>) (post-inc with loop %test2.loop) in %sext.us = mul i32 %mul.us, %sub.cond.us
> define i32 @test2() {
> entry:
> br label %test2.loop
>
> Modified:
> llvm/trunk/test/Analysis/ScalarEvolution/different-loops-recs.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/ScalarEvo
> lution/different-loops-recs.ll?rev=303730&r1=303729&r2=303730&view=dif
> f
> ======================================================================
> ========
> --- llvm/trunk/test/Analysis/ScalarEvolution/different-loops-recs.ll
> (original)
> +++ llvm/trunk/test/Analysis/ScalarEvolution/different-loops-recs.ll
> +++ Wed May 24 03:52:18 2017
> @@ -220,7 +220,8 @@ exit:
>
> ; Mix of previous use cases that demonstrates %s3 can be incorrectly
> treated as ; a recurrence of loop1 because of operands order if we
> pick recurrencies in an -; incorrect order.
> +; incorrect order. It also shows that we cannot safely fold v1
> +(SCEVUnknown) ; because we cannot prove for sure that it doesn't use Phis of loop 2.
>
> define void @test_03(i32 %a, i32 %b, i32 %c, i32* %p) {
>
> @@ -228,9 +229,9 @@ define void @test_03(i32 %a, i32 %b, i32
> ; CHECK: %v1 = load i32, i32* %p
> ; CHECK-NEXT: --> %v1
> ; CHECK: %s1 = add i32 %phi1, %v1
> -; CHECK-NEXT: --> {(%a + %v1),+,1}<%loop1>
> +; CHECK-NEXT: --> ({%a,+,1}<%loop1> + %v1)
> ; CHECK: %s2 = add i32 %s1, %b
> -; CHECK-NEXT: --> {(%a + %b + %v1),+,1}<%loop1>
> +; CHECK-NEXT: --> ({(%a + %b),+,1}<%loop1> + %v1)
> ; CHECK: %s3 = add i32 %s2, %phi2
> ; CHECK-NEXT: --> ({{{{}}((2 * %a) + %b),+,1}<%loop1>,+,2}<%loop2>
> + %v1)
>
> @@ -452,3 +453,60 @@ exit:
> %s6 = add i32 %phi3, %phi2
> ret void
> }
> +
> +; Make sure that a complicated Phi does not get folded with rec's
> +start value ; of a loop which is above.
> +define void @test_08() {
> +
> +; CHECK-LABEL: Classifying expressions for: @test_08
> +; CHECK: %tmp11 = add i64 %iv.2.2, %iv.2.1
> +; CHECK-NEXT: --> ({0,+,-1}<nsw><%loop_2> + %iv.2.1)
> +; CHECK: %tmp12 = trunc i64 %tmp11 to i32
> +; CHECK-NEXT: --> (trunc i64 ({0,+,-1}<nsw><%loop_2> + %iv.2.1) to i32)
> +; CHECK: %tmp14 = mul i32 %tmp12, %tmp7
> +; CHECK-NEXT: --> ((trunc i64 ({0,+,-1}<nsw><%loop_2> + %iv.2.1) to i32) * {-1,+,-1}<%loop_1>)
> +; CHECK: %tmp16 = mul i64 %iv.2.1, %iv.1.1
> +; CHECK-NEXT: --> ({2,+,1}<nuw><nsw><%loop_1> * %iv.2.1)
> +
> +entry:
> + br label %loop_1
> +
> +loop_1:
> + %iv.1.1 = phi i64 [ 2, %entry ], [ %iv.1.1.next,
> +%loop_1_back_branch ]
> + %iv.1.2 = phi i32 [ -1, %entry ], [ %iv.1.2.next,
> +%loop_1_back_branch ]
> + br label %loop_1_exit
> +
> +dead:
> + br label %loop_1_exit
> +
> +loop_1_exit:
> + %tmp5 = icmp sgt i64 %iv.1.1, 2
> + br i1 %tmp5, label %loop_2_preheader, label %loop_1_back_branch
> +
> +loop_1_back_branch:
> + %iv.1.1.next = add nuw nsw i64 %iv.1.1, 1
> + %iv.1.2.next = add nsw i32 %iv.1.2, 1
> + br label %loop_1
> +
> +loop_2_preheader:
> + %tmp6 = sub i64 1, %iv.1.1
> + %tmp7 = trunc i64 %tmp6 to i32
> + br label %loop_2
> +
> +loop_2:
> + %iv.2.1 = phi i64 [ 0, %loop_2_preheader ], [ %tmp16, %loop_2 ]
> + %iv.2.2 = phi i64 [ 0, %loop_2_preheader ], [ %iv.2.2.next, %loop_2
> +]
> + %iv.2.3 = phi i64 [ 2, %loop_2_preheader ], [ %iv.2.3.next, %loop_2
> +]
> + %tmp11 = add i64 %iv.2.2, %iv.2.1
> + %tmp12 = trunc i64 %tmp11 to i32
> + %tmp14 = mul i32 %tmp12, %tmp7
> + %tmp16 = mul i64 %iv.2.1, %iv.1.1
> + %iv.2.3.next = add nuw nsw i64 %iv.2.3, 1
> + %iv.2.2.next = add nsw i64 %iv.2.2, -1
> + %tmp17 = icmp slt i64 %iv.2.3.next, %iv.1.1
> + br i1 %tmp17, label %loop_2, label %exit
> +
> +exit:
> + %tmp10 = add i32 %iv.1.2, 3
> + ret void
> +}
>
> Modified:
> llvm/trunk/test/Transforms/LoopStrengthReduce/X86/incorrect-offset-sca
> ling.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopStr
> engthReduce/X86/incorrect-offset-scaling.ll?rev=303730&r1=303729&r2=30
> 3730&view=diff
> ======================================================================
> ========
> ---
> llvm/trunk/test/Transforms/LoopStrengthReduce/X86/incorrect-offset-sca
> ling.ll (original)
> +++ llvm/trunk/test/Transforms/LoopStrengthReduce/X86/incorrect-offset
> +++ -scaling.ll Wed May 24 03:52:18 2017
> @@ -25,7 +25,7 @@ L2:
> if6: ; preds = %idxend.8
> %r2 = add i64 %0, -1
> %r3 = load i64, i64* %1, align 8
> -; CHECK-NOT: %r2
> +; CHECK: %r2 = add i64 %0, -1
> ; CHECK: %r3 = load i64
> br label %ib
>
> @@ -36,13 +36,11 @@ ib:
> %r4 = mul i64 %r3, %r0
> %r5 = add i64 %r2, %r4
> %r6 = icmp ult i64 %r5, undef
> -; CHECK: [[MUL1:%[0-9]+]] = mul i64 %lsr.iv, %r3 -; CHECK:
> [[ADD1:%[0-9]+]] = add i64 [[MUL1]], -1 -; CHECK: add i64 %{{.}},
> [[ADD1]] -; CHECK: %r6
> +; CHECK: %r4 = mul i64 %r3, %lsr.iv
> +; CHECK: %r5 = add i64 %r2, %r4
> +; CHECK: %r6 = icmp ult i64 %r5, undef ; CHECK: %r7 = getelementptr
> +i64, i64* undef, i64 %r5
> %r7 = getelementptr i64, i64* undef, i64 %r5
> store i64 1, i64* %r7, align 8
> -; CHECK: [[MUL2:%[0-9]+]] = mul i64 %lsr.iv, %r3 -; CHECK:
> [[ADD2:%[0-9]+]] = add i64 [[MUL2]], -1
> br label %L
> }
>
> Modified:
> llvm/trunk/test/Transforms/LoopStrengthReduce/lsr-expand-quadratic.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopStr
> engthReduce/lsr-expand-quadratic.ll?rev=303730&r1=303729&r2=303730&vie
> w=diff
> ======================================================================
> ========
> ---
> llvm/trunk/test/Transforms/LoopStrengthReduce/lsr-expand-quadratic.ll
> (original)
> +++ llvm/trunk/test/Transforms/LoopStrengthReduce/lsr-expand-quadratic
> +++ .ll Wed May 24 03:52:18 2017
> @@ -7,16 +7,23 @@ target triple = "x86_64-apple-macosx"
> ;
> ; SCEV expander cannot expand quadratic recurrences outside of the ;
> loop. This recurrence depends on %sub.us, so can't be expanded.
> +; We cannot fold SCEVUnknown (sub.us) with recurrences since it is ;
> +declared after the loop.
> ;
> ; CHECK-LABEL: @test2
> ; CHECK-LABEL: test2.loop:
> -; CHECK: %lsr.iv = phi i32 [ %lsr.iv.next, %test2.loop ], [
> -16777216, %entry ] -; CHECK: %lsr.iv.next = add nsw i32 %lsr.iv,
> 16777216
> +; CHECK: %lsr.iv1 = phi i32 [ %lsr.iv.next2, %test2.loop ], [
> +-16777216, %entry ] ; CHECK: %lsr.iv = phi i32 [ %lsr.iv.next,
> +%test2.loop ], [ -1, %entry ] ; CHECK: %lsr.iv.next = add nsw i32
> +%lsr.iv, 1 ; CHECK: %lsr.iv.next2 = add nsw i32 %lsr.iv1, 16777216
> ;
> ; CHECK-LABEL: for.end:
> -; CHECK: %sub.cond.us = sub nsw i32 %inc1115.us, %sub.us -; CHECK:
> %sext.us = mul i32 %lsr.iv.next, %sub.cond.us -; CHECK: %f = ashr i32
> %sext.us, 24
> +; CHECK: %tobool.us = icmp eq i32 %lsr.iv.next2, 0 ; CHECK: %sub.us
> += select i1 %tobool.us, i32 0, i32 0 ; CHECK: %1 = sub i32 0,
> +%sub.us ; CHECK: %2 = add i32 %1, %lsr.iv.next ; CHECK: %sext.us =
> +mul i32 %lsr.iv.next2, %2 ; CHECK: %f = ashr i32 %sext.us, 24
> ; CHECK: ret i32 %f
> define i32 @test2() {
> entry:
>
> Modified:
> llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopStr
> engthReduce/post-inc-icmpzero.ll?rev=303730&r1=303729&r2=303730&view=d
> iff
> ======================================================================
> ========
> --- llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll
> (original)
> +++ llvm/trunk/test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll
> +++ Wed May 24 03:52:18 2017
> @@ -25,6 +25,8 @@ define void @_Z15IntegerToStringjjR7Vect
> entry:
> %buffer = alloca [33 x i16], align 16
> %add.ptr = getelementptr inbounds [33 x i16], [33 x i16]* %buffer,
> i64 0, i64 33
> + %sub.ptr.lhs.cast = ptrtoint i16* %add.ptr to i64
> + %sub.ptr.rhs.cast = ptrtoint i16* %add.ptr to i64
> br label %do.body
>
> do.body: ; preds = %do.body, %entry
> @@ -46,8 +48,6 @@ do.body:
> do.end: ; preds = %do.body
> %xap.0 = inttoptr i64 %0 to i1*
> %cap.0 = ptrtoint i1* %xap.0 to i64
> - %sub.ptr.lhs.cast = ptrtoint i16* %add.ptr to i64
> - %sub.ptr.rhs.cast = ptrtoint i16* %incdec.ptr to i64
> %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
> %sub.ptr.div39 = lshr exact i64 %sub.ptr.sub, 1
> %conv11 = trunc i64 %sub.ptr.div39 to i32
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list