[PATCH] D50474: [LV] Vectorize header phis that feed from if-convertable latch phis

Fri Aug 10 06:33:26 PDT 2018

anna added inline comments.

================
Comment at: lib/Transforms/Utils/LoopUtils.cpp:673
   auto *Previous = dyn_cast<Instruction>(Phi->getIncomingValueForBlock(Latch));
   if (!Previous || !TheLoop->contains(Previous) || isa<PHINode>(Previous) ||
       SinkAfter.count(Previous)) // Cannot rely on dominance due to motion.
----------------
Ayal wrote:
> Looks like one thing being asked here is: why bail out if `isa<PHINode>(Previous)`? @mssimpso answers that in r265983, in response to PR27246. There, `Previous` was a **header** PHINode, (specifically an Induction), which effectively means the original header PHI is a 2nd (or greater) order recurrence. It may be interesting to see if that case could be vectorized correctly.
> In any case, the motivation for this patch involves Previous which isa PHINode, but not of the header block. Such Previous's get vectorized into blends, so presumably this may be safe and helpful:
> 
> ```
>    auto *Previous = dyn_cast<Instruction>(Phi->getIncomingValueForBlock(Latch));
> -  if (!Previous || !TheLoop->contains(Previous) || isa<PHINode>(Previous) ||
> +  if (!Previous || !TheLoop->contains(Previous) ||
> +      (isa<PHINode>(Previous) &&
> +       Previous->getParent() == TheLoop->getHeader()) ||
>        SinkAfter.count(Previous)) // Cannot rely on dominance due to motion.
> ```
> 
> 
> Note however that for 1st order recurrence to work, Previous (including if it's a PHI of a non-header block) must dominate all users of the original header PHI. Or be made to dominate them by Sinking the users After Previous. In particular, Previous cannot be data-dependent on the original header PHI, which will close a dependence cycle.
> 
> In the tests below, %phiuseout2 which is the candidate Previous of %hdrphi1, also depends on it (e.g., via %tmp50), disqualifying it from being a 1st order recurrence. Every such dependence cycle must be broken, if possible, or rather expanded into a cycle of distance VF, as done for Inductions and Reductions, in order for vectorization to succeed. Efforts to handle more cyclic dependencies include D49168 and D22341. Not sure I follow how the tests below with their cyclic dependencies are expected to be vectorized, nor the "over vectorized" argument - perhaps related to D50480(?).
hi Ayal,
Thanks for the comments about the first order recurrence. 

For the test below, as you mentioned, it is not a first order recurrence. However, I think of it as a "predicated reduction". 
The computation I'm interested in vectorizing:
```hdrphi1 = boolarr[i+c] == 1 ? intarr[i+c] + (hdrphi2 == 0 ? hdrphi1 : 0) : (!(hdrphi2 == 0) ? hdrphi1 : -1)```
if boolarr[i+c] is 1 (and hdrphi2 is 0), then this computation reduces to a reduction:
```hdrphi1 = intarr[i+c] +hdrphi1```
if boolarr[i+c] != 1, then hdrphi1 is not dependent on previous iteration (it's either hdrphi1 itself or -1).
if boolarr[i+c] is 1 and hdrphi2 != 0, then
hdrphi1 = intarr[i+c] 

Some forms of predicated reduction is already supported today (when tested on clang), such as:
```if (a[i] > 0)
 x *= a[i];
```
I think it's because we have the identity vector for the else case? i.e. this becomes:
```
if (a[i] > 0)
 x *= a[i];
else
 x *= 1;
```

Repository:
  rL LLVM

https://reviews.llvm.org/D50474