[PATCH] D39340: Modifying reassociate for improved CSE

Thu Nov 2 09:59:13 PDT 2017

NAry was written fairly recently (past 2 years) by jingyue (cc'd), and
there are mailing list discussions about it, as well as what should happen
between reassociate and naryreassociate (and there are discussions in the
reviews as well)

Note also that eventually i will push NewGVN to do reassociation for CSE
purposes, because it has all the data anyway to do a good job of it.

(It will not subsume the balancing/etc we attempt to do, but it should
subsume all the redundancy elimination related reordering)

Of course, given the role i just took on, it's unlikely to happen for a
year, so it shouldn't block anything, but it's on the future plans list :)

On Thu, Nov 2, 2017 at 9:44 AM, escha via Phabricator <
reviews at reviews.llvm.org> wrote:

> escha added a comment.
>
> In https://reviews.llvm.org/D39340#913756, @fhahn wrote:
>
> > In https://reviews.llvm.org/D39340#910844, @escha wrote:
> >
> > > Hmm, reading through N-ary reassociate, it looks like it has similar
> goals but is not completely overlapping in action, for better or worse.
> > >
> > > N-ary reassociate runs in dominator order and looks to see if any
> subexpressions of the current computation have already been computed.
> > >
> > > But as far as I understand, it does not consider *reassociating
> expressions* to find those redundancies. In other words, it doesn't
> consider the N^2 possible sub-combinations of previous expressions; it only
> considers them in the order they were emitted. Its main goal seems to be
> optimizing addressing expressions. So for example, if you have (a+b) and a
> later expression computes (a+(b+2)), it can find that this is redundant.
> But if you have (a+(b+1)) and (a+(b+2)) but no (a+b), I don't think it
> handles that. It also only runs in dominator order, so it can't find
> arbitrary redundancies.
> > >
> > > Similarly, my change doesn't handle GetElementPtr in the way N-ary
> reassociate does, since it wasn't targeted at addressing expressions.
> > >
> > > So I agree they're redundant, and kind of share similar goals and
> purposes, but annoyingly I don't see any way to unify them and achieve the
> goals of both.
> >
> >
> > Thanks. I suppose it should be possible to to extend Reassociate to
> support GetElementPtr, and then we would get wha NaryReassociate provides
> with your patch? I am not saying this should be part of this patch, I am
> just trying to understand how we can build on this patch.
>
>
> I'm not sure. GetElementPtr is a fairly complex instruction that isn't
> easily mapped to other instructions (i.e. it can represent a whole chain
> instead of just one). I think the reason that Nary is able to do this is
> because it uses SCEV, which abstracts over that.
>
> Who wrote Nary; maybe they might have a useful opinion on all this?
>
> > Your patch uses function range counts for pairs, right? Could this lead
> to cases were this change reassociates operations based on this global
> count, even though CSE won't eliminate the reassociated expression (because
> it would be too costly because of control flow for example) and produces
> worse results? I assume this is not an issue for shader code, but could
> happen for control-flow heavy general code. I can run some benchmarks on
> AArch64 and share results in a couple of days.
>
> Yeah, I was thinking of that. In the most extreme case, you could end up
> with disjoint instructions in different control flow paths, but on the
> other hand, the new GVN-Hoist should ideally catch those. You could use
> something like a scoped data structure to handle this in theory, maybe?
>
>
>
> ================
> Comment at: lib/Transforms/Scalar/Reassociate.cpp:2212
> +        if (Ops[i].Op != Ops[j].Op) {
> +          auto it2 = PairMap[Idx].find({Ops[j].Op, Ops[i].Op});
> +          if (it2 != PairMap[Idx].end())
> ----------------
> fhahn wrote:
> > Could we order the keys in the pairs to avoid a second lookup here, i.e.
> insert pairs with <a, b> where a < b?
> hmm. what would be the best way to do this? It's possible for two
> instructions to have the same Rank, iirc. would pointer order be
> acceptable? I'd be worried about determinism.
>
>
> ================
> Comment at: lib/Transforms/Scalar/Reassociate.cpp:2251
>    // Traverse the same blocks that was analysed by BuildRankMap.
> -  for (BasicBlock *BI : RPOT) {
> -    assert(RankMap.count(&*BI) && "BB should be ranked.");
> -    // Optimize every instruction in the basic block.
> -    for (BasicBlock::iterator II = BI->begin(), IE = BI->end(); II != IE;)
> -      if (isInstructionTriviallyDead(&*II)) {
> -        EraseInst(&*II++);
> -      } else {
> -        OptimizeInst(&*II);
> -        assert(II->getParent() == &*BI && "Moved to a different block!");
> -        ++II;
> +  for (int q = 0; q < 2; ++q) {
> +    for (Pass = 0; Pass < 2; ++Pass) {
> ----------------
> fhahn wrote:
> > It looks like we are running the reassociate 4 times? Is this needed?
> From the description it sounds like 2 times should be enough?
> oh, right. this was me experimenting to see if i ran it twice I'd get
> better results .... and I did! I'm guessing this is because if you heavily
> reassociate based on global statistics, it opens up new opportunities, and
> so on. But it might be unnecessary.
>
> this is probably in part caused by the fact that my patch isn't worklist
> based (i.e. run over it repeatedly instead of add things to a worklist to
> revisit)
>
>
> ================
> Comment at: lib/Transforms/Scalar/Reassociate.cpp:2293
>
> -    // Make a copy of all the instructions to be redone so we can remove
> dead
> -    // instructions.
> -    SetVector<AssertingVH<Instruction>> ToRedo(RedoInsts);
> -    // Iterate over all instructions to be reevaluated and remove
> trivially dead
> -    // instructions. If any operand of the trivially dead instruction
> becomes
> -    // dead mark it for deletion as well. Continue this process until all
> -    // trivially dead instructions have been removed.
> -    while (!ToRedo.empty()) {
> -      Instruction *I = ToRedo.pop_back_val();
> -      if (isInstructionTriviallyDead(I)) {
> -        RecursivelyEraseDeadInsts(I, ToRedo);
> -        MadeChange = true;
> +      for (BasicBlock *BI : RPOT) {
> +        for (Instruction &I : *BI) {
> ----------------
> fhahn wrote:
> > It's probably easier to understand if this would be done in a separate
> function.
> good point. agreed
>
>
> Repository:
>   rL LLVM
>
> https://reviews.llvm.org/D39340
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171102/3924ba8d/attachment.html>