[PATCH] D37762: [InstCombine] Remove single use restriction from InstCombine's explicit sinking code.

Daniel Berlin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 13 15:17:04 PDT 2017


dberlin added a comment.

A couple things:

1. A proper and complete scalar PRE implementation would eliminate the redundant computation parts in your godbolt example (but not the redundant stores).

GCC's PRE at -O3 (which is mostly complete) will transform this into a selection of constants and 3 stores, all using those constants.
A proper NewGVN PRE would do the same.

NewGVN right now will catch the full redundancy. You will get:

  %phiofops = phi i32 [ 131074, %2 ], [ 196611, %6 ]
  %.0 = phi i32 [ 65537, %6 ], [ 0, %2 ]
  ...

  %.1 = phi i32 [ %phiofops, %10 ], [ %.0, %7 ]

So it eliminates one of the or's already, even without PRE.

2. The "proper" theoretical way to eliminate the stores in both (and do the sinking above) is Partial Dead Store/Partial Dead Code elimination.

It should put it in the optimal place as well.
There is a PDSE implementation under review (I have someone taking it over).

I would expect PDSE to take care of the first case completely, but as said, it will not sink the computations in your other example, only sink the store.

3. GCC has no PDSE; it has a simple sinking pass that I implemented to catch the common case (on by default).

It's an IR level pass.

What it does is, for an instruction I in block BB, take all the uses of I (with a phi node use counted as occurring in the appropriate predecessor), and find the nearest common dominator of all the uses.
This place, NCD, is a guaranteed safe location as long as BB dominates it (NCD may actually be above BB in some loop cases).

Then, in the dominator tree between BB and NCD, we want the block that is the most control dependent and in the shallowest loop nest.
So shallower loop nests are always considered better.
Same loop nest level is considered better if execution frequency is significantly lower than NCD execution frequency.
Otherwise, use NCD.

The resulting block is always a safe place to sink (because BB is an ancestor of NCD in the dominator tree, and we are only walking the dominator tree till we hit BB).
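
To make the block selection above concrete, here is a rough sketch in Python. It is not the GCC code, just my reading of the description: the Block class, the function names, and the 0.75 frequency threshold are all made up for illustration, and "do not move" when BB does not dominate NCD is an assumption. Phi uses are expected to be passed in already mapped to the matching predecessor block.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Block:
    """Hypothetical basic block: only the facts the heuristic needs."""
    name: str
    idom: Optional["Block"] = None  # immediate dominator; None for the entry block
    loop_depth: int = 0
    freq: float = 1.0  # estimated execution frequency


def dom_depth(b: Block) -> int:
    """Depth of b in the dominator tree (entry block is 0)."""
    depth = 0
    while b.idom is not None:
        b = b.idom
        depth += 1
    return depth


def nearest_common_dominator(a: Block, b: Block) -> Block:
    """Walk both blocks up the dominator tree until they meet."""
    da, db = dom_depth(a), dom_depth(b)
    while da > db:
        a, da = a.idom, da - 1
    while db > da:
        b, db = b.idom, db - 1
    while a is not b:
        a, b = a.idom, b.idom
    return a


def dominates(a: Block, b: Block) -> bool:
    """True if a is b or an ancestor of b in the dominator tree."""
    cur: Optional[Block] = b
    while cur is not None:
        if cur is a:
            return True
        cur = cur.idom
    return False


def select_sink_block(bb: Block, use_blocks: List[Block],
                      freq_ratio: float = 0.75) -> Block:
    """Pick a block to sink an instruction from bb into.

    use_blocks: blocks containing the uses (a phi use is represented by
    the corresponding predecessor block). freq_ratio is an assumed
    threshold for "significantly lower" execution frequency.
    """
    if not use_blocks:
        return bb
    ncd = use_blocks[0]
    for u in use_blocks[1:]:
        ncd = nearest_common_dominator(ncd, u)
    if not dominates(bb, ncd):
        return bb  # NCD is not below BB, so leave the instruction alone
    best = ncd
    cur = ncd
    # Walk the dominator tree from NCD up to (but not including) BB.
    while cur is not bb:
        if cur.loop_depth < best.loop_depth:
            best = cur  # a shallower loop nest always wins
        elif (cur.loop_depth == best.loop_depth
              and cur.freq < freq_ratio * best.freq):
            best = cur  # same nest level, significantly colder
        cur = cur.idom
    return best


# Diamond: A branches to B and C; the only use is a phi operand coming from B.
entry = Block("entry")
a = Block("A", idom=entry)
b = Block("B", idom=a, freq=0.5)
c = Block("C", idom=a, freq=0.5)
print(select_sink_block(a, [b]).name)  # prints "B"
```

Because the walk starts at NCD and only switches on a strict improvement, ties stay with the block closest to the uses, which is the most control dependent one.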

You could also use real control dependences to find the thing inside the most non-loop branches :)

In the above case, this sinks it into the pred of the phi, as you want.
In the godbolt example, as mentioned, this is a combination of PRE and PDSE.
The simple sinking pass I wrote is not smart enough to handle the PDSE case you've presented.

I believe LLVM has a similar simple sinking pass.


https://reviews.llvm.org/D37762




