[PATCH] D125301: [LoopVectorize] Add option to use active lane mask for loop control flow

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 20 00:56:07 PDT 2022


david-arm added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8712
+
+    auto *ALM = new VPInstruction(VPInstruction::ActiveLaneMask,
+                                  {CanonicalIVIncrementParts, TC}, DL);
----------------
fhahn wrote:
> david-arm wrote:
> > fhahn wrote:
> > > david-arm wrote:
> > > > fhahn wrote:
> > > > > It looks like the update for the phi doesn't depend on the phi, but only on trip count & main induction. 
> > > > > 
> > > > > I might be missing something, but is the phi actually needed or would it be possible to compute the active lane mask at the beginning of each iteration instead of at the end of the iteration?
> > > > Hi @fhahn, perhaps I've misunderstood your question here, but the point of this patch is to do the exact opposite of that, i.e. we *don't* want to generate the active lane mask for the current iteration - we want to create the mask for the next iteration. This requires a PHI to carry the live value around the loop. It is also the only way to use the mask for control flow, because at the point of branching we want to know if there are any active lanes in the mask for the *next* iteration.
> > > > 
> > > > With particular reference to SVE, the motivation for this work is to use the 'whilelo' instruction to both generate the lane mask and set the flags to branch on. Effectively, the whilelo instruction is doing the comparison already, which makes the traditional scalar IV comparison redundant.
> > > > 
> > > > I could be wrong, but I believe this form of vectorised loop would be beneficial for some other targets with a predicated instruction set too, such as RISC-V.
> > > > With particular reference to SVE, the motivation for this work is to use the 'whilelo' instruction to both generate the lane mask and set the flags to branch on. 
> > > 
> > > Oh right, I missed this in the test case I was looking at! I originally thought the intention of the new phi recipe was to encode some extra guarantees/information like we do for inductions or reductions. 
> > > 
> > > From the latest comment, it sounds like there would be no need to have a new recipe class for the phi, but perhaps VPWidenPHIRecipe could be used instead, if the setup can be moved to the pre-header.
> > So I actually tried doing exactly this initially, but I think that VPWidenPHIRecipe requires an underlying scalar instruction to widen due to the execute function that lives in LoopVectorize.cpp:
> > 
> >   void VPWidenPHIRecipe::execute(VPTransformState &State) {
> >     State.ILV->widenPHIInstruction(cast<PHINode>(getUnderlyingValue()), this,
> >                                    State);
> >   }
> > 
> > but the PHI node does not exist in the original scalar loop. It's not currently possible to widen a PHI that didn't previously exist, which means I would have to modify VPWidenPHIRecipe to test for the existence of an underlying value and take different paths accordingly.
> > So I actually tried doing exactly this initially, but I think that VPWidenPHIRecipe requires an underlying scalar instruction to widen due to the execute function that lives in LoopVectorize.cpp:
> 
> 
> Yeah this was a bit unfortunate! I landed 2 patches that removed the unnecessary dependence on the underlying instruction by instead using the type of its operand.
OK well thanks a lot for tidying that class up! The only problem is that even after your patches landed, VPWidenPHIRecipe::execute only ever deals with Part 0, whereas we need a PHI for every Part. So unfortunately I still can't use the class in its current state. If you have suggestions about how to fix this I'm happy to take another look. Perhaps I can add a boolean flag to the VPWidenPHIRecipe constructor to indicate whether to generate a PHI for every part?
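
For readers following the thread, the loop shape discussed above can be modelled in plain C++. This is only a rough scalar sketch under stated assumptions, not the code the patch generates: `activeLaneMask`, `LoopStats` and `runPredicatedLoop` are hypothetical names, each mask is modelled as a `std::vector<bool>`, and the masks for the next iteration are carried around the loop in a variable that stands in for one PHI per unrolled part.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of an active lane mask:
// lane i is active iff Base + i < TripCount.
std::vector<bool> activeLaneMask(std::size_t Base, std::size_t TripCount,
                                 std::size_t VF) {
  std::vector<bool> Mask(VF);
  for (std::size_t Lane = 0; Lane < VF; ++Lane)
    Mask[Lane] = Base + Lane < TripCount;
  return Mask;
}

struct LoopStats {
  std::size_t Processed = 0;        // elements touched by active lanes
  std::size_t VectorIterations = 0; // trips around the vector loop
};

// Models a predicated vector loop with VF lanes and UF unrolled parts.
LoopStats runPredicatedLoop(std::size_t TripCount, std::size_t VF,
                            std::size_t UF) {
  assert(VF > 0 && UF > 0 && "need at least one lane and one part");

  // Preheader: compute the mask of every part for the first iteration.
  std::vector<std::vector<bool>> Masks(UF);
  for (std::size_t Part = 0; Part < UF; ++Part)
    Masks[Part] = activeLaneMask(Part * VF, TripCount, VF);

  LoopStats Stats;
  std::size_t Index = 0;
  do {
    // Loop body: only active lanes do any work.
    for (const auto &Mask : Masks)
      for (bool Active : Mask)
        Stats.Processed += Active ? 1 : 0;
    ++Stats.VectorIterations;

    // Latch: compute the masks for the *next* iteration. Masks plays the
    // role of the per-part PHIs that carry these values around the loop.
    Index += VF * UF;
    for (std::size_t Part = 0; Part < UF; ++Part)
      Masks[Part] = activeLaneMask(Index + Part * VF, TripCount, VF);

    // Branch on the first lane of part 0 of the next mask; there is no
    // separate scalar comparison of the induction variable.
  } while (Masks[0][0]);
  return Stats;
}
```

In this model the next-iteration masks are the only loop-carried state the branch depends on, and every unrolled part needs its own carried mask, which is why a recipe that only handles Part 0 is not sufficient.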


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:764
+    BranchOnCond,
+    BranchOnActiveLaneMask,
   };
----------------
fhahn wrote:
> david-arm wrote:
> > david-arm wrote:
> > > fhahn wrote:
> > > > Can `BranchOnCond` be used instead of the dedicated `BranchOnActiveLaneMask`?
> > > I created this patch a month ago, which predated your BranchOnCond work. That's why I haven't used it. I can certainly look into this and see if it's possible though?
> > So I did look into this. In order to do it this way I have to explicitly generate the Not and ExtractElement operations using VPInstructions, which requires a new VPInstruction::ExtractElement type. It's possible to do this, but then I wasn't sure about the semantics of this new instruction. When passing in a scalar constant of 0 for the lane, it gets widened to something like <vscale x 4 x i32> zeroinitializer for every part. However, I only need a single lane so I'd have to do something like:
> > 
> >   case VPInstruction::ExtractElement: {
> >     Value *Vec = State.get(getOperand(0), Part);
> >     Value *Lane = State.get(getOperand(1), VPIteration(0, 0));
> >     Value *V = Builder.CreateExtractElement(Vec, Lane);
> >     State.set(this, V, Part);
> >     break;
> >   }
> > 
> > It feels quite inefficient to go to all the effort of widening, only to discard everything!
> > 
> > If you still prefer me to proceed with this approach I'm happy to try if you can provide your thoughts on what the new ExtractElement operation should look like?
> > So I did look into this. In order to do it this way I have to explicitly generate the Not and ExtractElement operations using VPInstructions, which requires a new VPInstruction::ExtractElement type. 
> 
> Extracts are not modeled explicitly at the moment and usually `State.get` will take care of inserting an extract when requesting scalar lanes if it is needed. I *think* when using a `VPInstruction::Not` as operand for `BranchOnCond`, `State.get` should insert the extract for the first lane, as this is what `BranchOnCond` uses.
OK I'll take another look and give this a try!
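
As a rough illustration of the suggestion above, the exit condition can be thought of as the first lane of the negated next-iteration mask. The sketch below is a scalar model only: `notMask` and `exitCondition` are hypothetical helpers, not VPlan or LLVM APIs, and the mask is assumed to be non-empty.

```cpp
#include <cstddef>
#include <vector>

// Lane-wise logical NOT of a mask (hypothetical stand-in for a widened Not).
std::vector<bool> notMask(const std::vector<bool> &Mask) {
  std::vector<bool> Result(Mask.size());
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    Result[Lane] = !Mask[Lane];
  return Result;
}

// Exit condition of the loop: the first lane of the negated mask for the
// next iteration. Only lane 0 is read; the other lanes are ignored.
// Assumes NextMask has at least one lane.
bool exitCondition(const std::vector<bool> &NextMask) {
  return notMask(NextMask).front();
}
```

Because only lane 0 is consumed, negating the whole mask and then taking the first lane gives the same result as taking the first lane and negating it, so no explicit extract operation is needed in this model.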


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125301/new/

https://reviews.llvm.org/D125301


