[llvm-dev] SCCP is not always correct in presence of undef (+ proposed fix)

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Fri Dec 30 19:41:52 PST 2016


On Fri, Dec 30, 2016 at 9:27 AM, Davide Italiano via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi.
> I'm sending this email to -dev as this may be of interest of
> many/people may have opinions/want to try the change before it goes in
> to report problems.
> I've been recently working on a patch to integrate `undef` in the SCCP
> solver, in the hope of fixing a tail of latent bugs in SCCP which
> remained uncovered for many years. I think this is a decent time to
> propose, so that it can (hopefully) be integrated in the 4.0 branch
> (or backported, FWIW), if it gets reviewed and there are no
> substantial problems.
>
> #### Background:
> The current lattice for SCCP doesn't know about `undef`, that is, it's
> identical to the one described in [1]. There are three lattice states,
> "Unknown"/"Constant"/"Overdefined" (the first one is what the paper
> calls "Undefined", but was renamed as "undefined" in LLVM has a
> different meaning).
>
> SCCP (or its interprocedural counterpart) currently consists of two phases:
> 1) The solver, which implements the algorithm described in the paper.
> At the end of this phase, if there are no `undef` values in the input,
> all the values supposedly have been lowered (in a lattice-theoretical
> sense) to either constant or overdefined. In presence of `undef`,
> there may be values which are still in the 'unknown' state.
> 2) The resolver assumes that everything in the 'unknown' state is
> undefined, and runs a procedure trying to plug the "correct" value,
> and in case something changed, runs the solver again.
>
> #### Problem/Motivation for the change:
> 1) and 2) use different orders of visitation. 1) walks the use-def
> chains, 2) visit the function in basic block order (starting from the
> entry block of the function).
> The fact that 2) walks over the function instead of the SSA graph
> creates some issues where it cannot distinguish between the following
> two cases:
>
> a) It doesn't know how to resolve this value
> b) This value is really undef.
>
> and may plug in wrong/invalid values. For a real example, see the bug
> in [3], for a proof of why it's wrong, see [2] (courtesy of Daniel
> Berlin).
>
> #### Proposed fix:
> Integrate `undef` in the solver. Undef is a lattice value lower than
> `Unknown` but higher than `Overdefined` (at the same lattice level of
> constant). Graphically:
>
>    unknown
>     /     \      \
>    /       \      \
>  c1 ...  ck undefined
>   \         |       /
>    \        |      /
>    overdefined
>
>
> #### Benefits:
> -> The code now is correct (modulo other bugs =)) in presence of undefs
> -> This allows us to keep the same lattice height and therefore all
> the nice termination/convergence properties/bounds (see the paper for
> reference).
> -> This allows to remove a lot of code (pretty much the resolver) and
> makes the algorithm easier to understand
>

How does this eliminate the need for the resolver? Without the resolver,
what will plug in values for the undef's?


> -> SCCP doesn't need anymore to roll its own tiny constant folder
> where we need to add cases everytime we realize there's something
> missing, instead, it can leverage the full power of ConstantFolding
> when something folds to undef.
>

Can you elaborate on why SCCP currently needs to "roll its own tiny
constant folder"? I wasn't able to understand it from just the context here.

-- Sean Silva


> -> Makes SCCP *slightly* faster.
>
> #### (Potential) drawbacks:
> -> This change the behavior of `phi` handling, i.e. we lose some cases
> as we force phi to overdefined if not the values are constant. I
> measured the impact on the llvm-testsuite and on some internal
> benchmarks and I wasn't able to measure any substantial different in
> runtime performances. If you care about this case, please speak up =)
> (or try the patch and report regressions). There are probably ways to
> make this working but we (see [4], comment 7 and 8) think the gain in
> precision is not worth the complexity (and what's here is *much much*
> better than what's in the tree as it tries at least to be correct).
>
> #### Plan for integration
> The patch is available at https://reviews.llvm.org/D28177
> Any comments/feedback/testing is definitely welcome. This is a
> self-contained change.
>
> #### (Possible) future work
> If this goes in I'd like to implement constant propagation for missing
> IR constructs (in particular vectors), and make sure SCCP is on par
> with the other two costant propagation passes we have in llvm (and
> maybe garbage collect them).
> Danny suggested an improvement could be that of propagating facts in
> both directions (GCC does that as a separate pass
> https://gcc.gnu.org/svn/gcc/trunk/gcc/gimple-ssa-backprop.c). I have
> no immediate plans to work on this (I suspect NewGVN and other places
> will keep me busy for a while), but I hope to eventually get there.
>
> #### Thanks
> Eli Friedman and Daniel Berlin provided very insightful
> conversations/suggestions on how to tackle the problems, and provided
> early feedback on the change (other than reviewing large part if not
> all my previous work on SCCP).
>
>
> [1] https://www.cs.utexas.edu/users/lin/cs380c/wegman.pdf
> [2] http://lists.llvm.org/pipermail/llvm-commits/Week-
> of-Mon-20161107/403212.html
> [3] https://llvm.org/bugs/show_bug.cgi?id=30448
> [4] https://llvm.org/bugs/show_bug.cgi?id=30966#c8
>
> --
> Davide
>
> "There are no solved problems; there are only problems that are more
> or less solved" -- Henri Poincare
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161230/119caa19/attachment.html>


More information about the llvm-dev mailing list