[PATCH] D24805: [GVNSink] Initial GVNSink prototype

Thu Sep 22 03:39:24 PDT 2016

jmolloy added a comment.

Hi Matthias,

In https://reviews.llvm.org/D24805#549136, @MatzeB wrote:

> Without looking at anything in the patch. For the example given, wouldn't it be enough to create some DAGCombine style rules:
>
>   x = PHI(OP(y, c), OP(z, c))
>   =>
>   x = OP(PHI(y, z), c)
>   
>
> ?

Adding to what Danny said, the primary issue with taking a greedy approach to sinking is the cost model. We obviously don't want to produce hundreds of PHIs, so we need to only sink instructions when we think the operands are equivalent. Consider this:

  bb1:
    %a = load i32 %b
    %c = add i32 %a, 4
    store %c, i32* %p

  bb2:
     %a2 = load i32 %b
     %c2 = add i32 %a2, 4
     store %c2, i32* %q

If we attacked this in a greedy fashion, we'd compare the two stores and notice that both operands differ. We'd therefore require two extra PHIs to sink that store:

  bb3:
    %p1 = phi %c, %c2
    %p2 = phi %p, %q
    store %p1, i32* %p2

But if we can look more holistically at the entire sequence, we can see that in the three instructions only one operand differs, %p and %q. So we could sink everything with just one additional PHI:

  bb3:
    %p1 = phi %p, %q
    %a = load i32* %b
    %c = add %a, 4
    store %c, i32* %p1

This problem of determining equivalence is why the cost model and analysis is so complex, which makes up around 80% of the pass. We can also do things like inserting new blocks. This explanation is one I created earlier and is lifted out of SimplifyCFG.cpp - it applies here and GVNSInk should do an even better job without the restrictions on one conditional edge:

  // We support two situations:
  //   (1) all incoming arcs are unconditional
  //   (2) one incoming arc is conditional
  //
  // (2) is very common in switch defaults and
  // else-if patterns;
  //
  //   if (a) f(1);
  //   else if (b) f(2);
  //
  // produces:
  //
  //       [if]
  //      /    \
  //    [f(1)] [if]
  //      |     | \
  //      |     |  \
  //      |  [f(2)]|
  //       \    | /
  //        [ end ]
  //
  // [end] has two unconditional predecessor arcs and one conditional. The
  // conditional refers to the implicit empty 'else' arc. This conditional
  // arc can also be caused by an empty default block in a switch.
  //
  // In this case, we attempt to sink code from all *unconditional* arcs.
  // If we can sink instructions from these arcs (determined during the scan
  // phase below) we insert a common successor for all unconditional arcs and
  // connect that to [end], to enable sinking:
  //
  //       [if]
  //      /    \
  //    [x(1)] [if]
  //      |     | \
  //      |     |  \
  //      |  [x(2)] |
  //       \   /    |
  //   [sink.split] |
  //         \     /
  //         [ end ]
  //

The ability to sink from a subset of the predecessors is what allows this optimization to remove large amounts of code. We canonicalize to a single %cleanup block at the end of a function, so there is frequently one block with *many* predecessors from different parts of the function - some of these predecessors will have common code to sink and others won't. It's important to be able to determine when there is enough code sinkable from a subset of predecessors to make it worthwhile to create a new block.

James

Repository:
  rL LLVM

https://reviews.llvm.org/D24805