[PATCH] D12073: Make ScalarEvolution::isKnownPredicate a little smarter
hfinkel@anl.gov via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 17 23:51:52 PDT 2015
hfinkel added a comment.
In http://reviews.llvm.org/D12073#226385, @sanjoy wrote:
> In http://reviews.llvm.org/D12073#226151, @hfinkel wrote:
>
> >
>
>
>
>
> > There are now three failing tests, mostly due to IV widening that now does not happen. The new known-predicate results are:
>
> >
>
> > test/Transforms/IndVarSimplify/elim-extend.ll:
>
> >
>
> > - {%init,+,1}<nsw><%loop> sle {(1 + %init),+,1}<nsw><%loop>
>
> > - {(sext i32 %init to i64),+,1}<nsw><%loop> sle {(1 + (sext i32 %init to i64)),+,1}<nsw><%loop>
>
> > - {-1,+,1}<nsw><%outerloop> sle {1,+,1}<nuw><%outerloop>
>
> > - {-1,+,1}<nsw><%outerloop> sle {1,+,1}<nuw><nsw><%outerloop>
>
> >
>
> > test/Transforms/IndVarSimplify/iv-sext.ll:
>
> > - {-1,+,1}<nsw><%bb> sle {1,+,1}<nuw><%bb>
>
> > - {-1,+,1}<nsw><%bb> sle {1,+,1}<nuw><nsw><%bb>
>
> >
>
> > test/Transforms/IndVarSimplify/strengthen-overflow.ll:
>
> > - {%init,+,1}<nsw><%loop> sle {(1 + %init),+,1}<nsw><%loop>
>
> >
>
> > These look right, but some consumer of the information seems to make a worse no-wrap deductions as a result (thus, this patch is still WIP).
>
>
> I suspect (haven't verified) this is due to SCEV's caching mechanism. You end up creating a "Sext(LHS)" too soon, before proving all of the things needed to show that LHS is nsw. Then this suboptimal result gets cached.
>
> In other words, I suspect the stack trace looks like this (youngest frame at top) initially:
>
> == false because of PendingLoopPredicates, set in frame (A)
> getImpliedCond(...) // ends up looking at VAL
> getLoopBackedgeCount(...)
> getSignExtend(VAL) // due to your change
> getImpliedCond(...) // ends up looking at VAL .. (A)
> getLoopBackedgeCount(...)
> getSignExtend(VAL)
>
>
> This goes on as
>
> getSignExtend(VAL) ==> no simplification, and we end up with
> UniqueSCEVs[(SignExtend, VAL)] = SignExtendExpr VAL, and
> does not set the <nsw> bit in VAL
> getImpliedCond(...) // ends up looking at VAL .. (A)
> getLoopBackedgeCount(...)
> getSignExtend(VAL)
>
>
> And then finally the oldest frame returns:
>
> getSignExtend(VAL) // this returns a widened version of VAL (instead of
> an explicit sign extend), and also set VAL's nsw bit
>
>
> However, the damage has been done since `UniqueSCEVs[(SignExtend, VAL)] == (SignExtendExpr VAL)` and that's the value `ScalarEvolution::getSignExtendExpr` will return from now on.
>
> If you implement your check without actually creating a sign extend, but by directly inferring no-overflow from the nsw or nuw flags, then I suspect this won't happen.
Looking at the test/Transforms/IndVarSimplify/strengthen-overflow.ll test case, for example, here's (in part) what is going on, at least in terms of initial behavior:
SimplifyIndvar::strengthenOverflowingOperation has code that looks like this:
if (!BO->hasNoUnsignedWrap()) {
const SCEV *ExtendAfterOp = SE->getZeroExtendExpr(SE->getSCEV(BO), WideTy);
const SCEV *OpAfterExtend = (SE->*GetExprForBO)(
SE->getZeroExtendExpr(LHS, WideTy), SE->getZeroExtendExpr(RHS, WideTy),
SCEV::FlagAnyWrap);
if (ExtendAfterOp == OpAfterExtend) {
BO->setHasNoUnsignedWrap();
SE->forgetValue(BO);
Changed = true;
}
}
if (!BO->hasNoSignedWrap()) {
const SCEV *ExtendAfterOp = SE->getSignExtendExpr(SE->getSCEV(BO), WideTy);
const SCEV *OpAfterExtend = (SE->*GetExprForBO)(
SE->getSignExtendExpr(LHS, WideTy), SE->getSignExtendExpr(RHS, WideTy),
SCEV::FlagAnyWrap);
if (ExtendAfterOp == OpAfterExtend) {
BO->setHasNoSignedWrap();
SE->forgetValue(BO);
Changed = true;
}
}
Now the fun part is that the new code actually triggers as part of the checks done under `if (!BO->hasNoUnsignedWrap())`, but we can't prove nuw regardless, but it changes things such that the code under `if (!BO->hasNoSignedWrap())` no longer works as well as it did previously.
We start with LHS = {%init,+,1}<%loop>, RHS = 1, and as part of computing OpAfterExtend (in the unsigned case), we hit this part of ScalarEvolution::getSignExtendExpr:
// If the backedge is guarded by a comparison with the pre-inc value
// the addrec is safe. Also, if the entry is guarded by a comparison
// with the start value and the backedge is guarded by a comparison
// with the post-inc value, the addrec is safe.
ICmpInst::Predicate Pred;
const SCEV *OverflowLimit =
getSignedOverflowLimitForStep(Step, &Pred, this);
if (OverflowLimit &&
(isLoopBackedgeGuardedByCond(L, Pred, AR, OverflowLimit) ||
(isLoopEntryGuardedByCond(L, Pred, Start, OverflowLimit) &&
isLoopBackedgeGuardedByCond(L, Pred, AR->getPostIncExpr(*this),
OverflowLimit)))) {
// Cache knowledge of AR NSW, then propagate NSW to the wide AddRec.
const_cast<SCEVAddRecExpr *>(AR)->setNoWrapFlags(SCEV::FlagNSW);
return getAddRecExpr(
getExtendAddRecStart<SCEVSignExtendExpr>(AR, Ty, this),
getSignExtendExpr(Step, Ty), L, AR->getNoWrapFlags());
}
}
and this cached nsw flag ends up on our RHS, so we now have:
LHS = {%init,+,1}<nsw><%loop>, RHS = 1
and so the call to `SE->getSignExtendExpr(LHS, WideTy)` now returns:
i32 {%init,+,1}<nsw><%loop> to i64)
instead of returning:
{(sext i32 %init to i64),+,1}<nsw><%loop> (as it does without this patch).
as a result, we get:
ExtendAfterOp == {(1 + (sext i32 %init to i64)),+,1}<nsw><%loop>
OpAfterExtend == (1 + (sext i32 {%init,+,1}<nsw><%loop> to i64))
instead of:
ExtendAfterOp == OpAfterExtend == {(1 + (sext i32 %init to i64)),+,1}<nsw><%loop>
interestingly, the results in the nuw case are also slightly different (although in neither case are equal because we can't provide nuw regardless). With the patch we get this:
ExtendAfterOp == (zext i32 {(1 + %init),+,1}<nsw><%loop> to i64)
OpAfterExtend == (1 + (zext i32 {%init,+,1}<nsw><%loop> to i64))
whereas currently we get:
ExtendAfterOp == (zext i32 {(1 + %init),+,1}<%loop> to i64)
OpAfterExtend == (1 + (zext i32 {%init,+,1}<%loop> to i64))
(so we get some extra <nsw>s for our trouble)
http://reviews.llvm.org/D12073
More information about the llvm-commits
mailing list