<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jan 3, 2017 at 5:43 AM, Hal Finkel via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Philip,<br>

<br>

First, I'm planning on rewriting this change to be more targeted, specifically to take the alternative approach we discussed briefly on the review thread and keep the list of affected variables in the assumption cache.</blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Second, regarding motivation, the problem is not the basic-block scan (with the assumption cache, we actually do very little of that anyway). The core problem is that, in ValueTracking, which is used heavily by InstCombine and other passes, checks all assumptions in each function for every variable being processed to determine known bits, etc. These checks are pattern matches, and while they normally fail quickly, as the number of assumptions increases, this is a large problem: it makes known-bits determination O(#assumptions in the function), for every variable, in the common case, and, because the assumption processing is part of the recursive known-bits procedure, the constant factor can be large too given certain kinds of inputs. What we need for ValueTracking (and also for LVI and maybe others) is a way to cache which variables are (usefully) affected by which assumptions.<br>

<br></blockquote><div><br></div><div>When you say affected, do you mean "what uses"?<br>Because your operand bundles change, does not really do that.</div><div>For a given use I still have to find the assume (which is in the middle of a block), then go to the bundle, then look at the uses of the things in that bundle</div><div>Or am i missing something?<br><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Regarding your suggestions below, I can definitely see how we might run into a problem with the dominance checking if we had a large number of assumptions which affected a variable -- it is only when we have an assumption that potentially affects a variable that we do any of the dominance checking -- but I've not run into any such cases yet.<br></blockquote><div><br></div><div>Note: I have precisely the same problem with newgvn's predicate handling:</div><div><br></div><div>I need to be able to tell, for a given instruction, what predicates dominate it (including assumes)</div><div><br></div><div>It's pretty fast for most predicates, because they occur at the end of blocks, but could still be sped up.</div><div>For assumes, not so much, because they can occur wherever.</div><div><br></div><div>I have a prototype where i've implemented e-ssa to solve both problems. It requires the introduction of one intrinsic, and some use renaming.</div><div><br></div><div>IE</div><div>if (foo == 50) {</div><div>} else { }</div><div><br></div><div>becomes</div><div><br></div><div>if (foo == 50) {</div><div>foo1 = predicateinfo(foo, the icmp of foo )</div><div>(use foo1)</div><div>} else {</div><div>foo2 = predicateinfo(foo, the icmp of foo)</div><div>(use foo2)</div><div>}<br></div><br><div>Once that's done, finding the set of predicates that usefully affect a given use is trivial, as you can just walk the def-use chains.</div><div><br></div><div>I haven't implemented it for assume, but it should work just as well for that.</div><div><br></div><div>I'm somewhat against caching solutions and hacky here,I generally prefer to solve infrastructury problems with better infrastructure.  Caching tends to just kick the can down the road, and often makes it harder to build better infrastructure.  It makes sense when the change is significantly harder to get right, and not worth the cost. Here, honestly, we can do better. </div><div>We have plenty of cases where CVP, etc, will be significantly sped up by the above (i can hand them to you) but not by your change.</div><div><br></div><div>Note also that it improves a large set of passes.</div><div>For example, the variable split enables SCCP to do correlated value propagation (trivially for equality comparison, other things with a bit of work)</div><div><br></div><div>Before, it could only do something if "foo" turned out the same everywhere.</div><div>After, you can propagate  "foo == 50" to the foo1 branch.</div><div><br>Early measurements come out to a few percent more constants found total.</div><div><br></div><div><br></div></div></div></div>