[PATCH] D50579: [LV] Teach about non header phis that have uses outside the loop

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 12 15:03:40 PDT 2018


Ayal added inline comments.


================
Comment at: lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:610
+          // if-convert the phi. May not be needed.
+          if (Phi->getNumIncomingValues() == 2 && Phi->hasNUses(1)) {
+             AllowedExit.insert(Phi);
----------------
Only 2 incoming values: only header phi's are required to have two incoming values; non-header phi's get if-converted for any number of predecessors.

Only 1 use: in general, it shouldn't really matter how many uses there are, nor whether I is a PHINode; if I is defined **inside** the loop, and the loop gets vectorized, it should be possible to obtain the value of I associated with the last iteration, either inside or outside the loop. Regarding cyclic dependencies, these are taken care of by analyzing each header phi, and they in turn take care of their own live-outs, by pre-computing (Inductions) or post-processing (Reductions); so any cross-iteration dependence issues will be detected, and need to be addressed, there (cf. D22341). 

Note that some Instructions may require special attention when obtaining last-lane value, e.g., `isUniformAfterVectorization()`. So starting with non-header PHI's sounds like a good first step.


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3723
     if (LCSSAPhi.getNumIncomingValues() == 1) {
-      assert(OrigLoop->isLoopInvariant(LCSSAPhi.getIncomingValue(0)) &&
-             "Incoming value isn't loop invariant");
-      LCSSAPhi.addIncoming(LCSSAPhi.getIncomingValue(0), LoopMiddleBlock);
+      auto *IncomingValue = LCSSAPhi.getIncomingValue(0);
+      if (!Legal->isAllowedNonHeaderExit(IncomingValue)) {
----------------
Would it suffice to do here
```
      Builder.SetInsertPoint(LoopMiddleBlock->getTerminator());
      Value *LastIncomingValue = getOrCreateScalarValue(IncomingValue, {UF-1, VF-1});
      LCSSAPhi.addIncoming(LastIncomingValue, LoopMiddleBlock);
```
instead of the below? Note that in general, if `IncomingValue` `isUniformAfterVectorization()`, we'll need to ask for lane 0 instead of VF-1.


================
Comment at: test/Transforms/LoopVectorize/no_outside_user.ll:19
 ; CHECK-LABEL: @main(
-; CHECK-NOT: <2 x i32>
+; CHECK: <2 x i32>
 
----------------
Good to also check that the live-out is extracted correctly.


================
Comment at: test/Transforms/LoopVectorize/no_outside_user.ll:47
+; CHECK-LABEL: @test2(
+; CHECK: <2 x i32>
+define i32 @test2()  {
----------------
This vectorizes just fine, right?


================
Comment at: test/Transforms/LoopVectorize/no_outside_user.ll:74
+; FIXME: Because of the lcssa creation for tmp17, we will have two uses of tmp17
+; outside the loop. We need to extend the above process to multiple uses.
+define i32 @reduction_sum(i32 %n, i32* noalias nocapture %A, i32* noalias nocapture %B) nounwind uwtable readonly noinline ssp {
----------------
This cannot be vectorized, because reductions are allowed to have only the bump as a single inside user, plus outside users. The header `%sum.02` phi flags this. See `RecurrenceDescriptor::AddReductionVar`


Repository:
  rL LLVM

https://reviews.llvm.org/D50579





More information about the llvm-commits mailing list