[PATCH] D99750: [LV, VP]VP intrinsics support for the Loop Vectorizer

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 2 13:47:59 PDT 2023


ABataev added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8112
   // When not folding the tail, use nullptr to model all-true mask.
-  if (!CM.foldTailByMasking()) {
+  if (!CM.foldTailByMasking() || CM.useVPIVectorization()) {
     BlockMaskCache[Header] = nullptr;
----------------
fhahn wrote:
> ABataev wrote:
> > fhahn wrote:
> > > ABataev wrote:
> > > > fhahn wrote:
> > > > > ABataev wrote:
> > > > > > fhahn wrote:
> > > > > > > Better to replace the mask together with introducing EVL to make sure EVL gets added when the mask gets removed?
> > > > > > Currently it will require some extra work. We'll need to handle both cases, with activelane instrnsics and direct comparison. Would be possible to keep it for now and fix it once you land emission of activelane intrinsic in VPlan-toVPlan transform?
> > > > > With the latest version, can the `useVPWithVPEVLVectorization` part be dropped (if the transform is updated to remove the mask from load/stores)?
> > > > Not quite, it will require an extra VPValue, something like VPAllTrueMask, which should replace IV <= BTC. Shall I add it?
> > > Would a live in `i1 true` work? I think that may work as is. As EVL is only used for lowering of loads/stores at the moment, it should be only removed there for now?
> > You mean scalar i1 true? 
> yes, that should be broadcasted across all vector lanes
I don't like that we won't match actual type here. I thought about other possible solution - overload VPActiaveLaneMask and make it return all-true mask for RVL targets. WDYT?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:832
+      continue;
+    auto *NewInst =
+        new VPInstruction(VPInstruction::ExplicitVectorLengthIVIncrement,
----------------
fhahn wrote:
> ABataev wrote:
> > fhahn wrote:
> > > ABataev wrote:
> > > > fhahn wrote:
> > > > > ABataev wrote:
> > > > > > fhahn wrote:
> > > > > > > I think turning the step of the canonical induction non-loop-invariant technically turns the canonical IV into a phi that's not a canonical IV any more (which is guaranteed to step the same amount each iteration). Would it work to keep the increment unchanged and keep rounding up the trip count was with regular tail folding initially? Further down the line, the canonical IV issue may be resolved by also replacing the canonical IV node with a regular scalar phi when doing the replacement here.
> > > > > > I'll try to improve this.
> > > > > Did you get a chance to try this out yet? 
> > > > > 
> > > > > 97687b7aea17 landed, it would probably be good to also remove the header mask from load/store recipes here, to make clear that this optimizes the tail-folded loop?
> > > > Already did. The loop is countable, adding a new phi won't give anything, just some extra work without any effect.
> > > Oh right I missed that, sorry! 
> > > 
> > > Does the latest version actually have to update the canonical IV increment? 
> > > 
> > > I might be missing something, but shouldn't the exit condition now use the rounded up version (a multiple of the VF) of the trip count for the compare, so if we increment by EVL then the we might not reach the exit condition?
> > There are 2 increments now: 2 than feeds into PHI and another one used for exit condition. It uses rounded version of trip count for comparison.
> Right, but why both are needed? Can there be more than 1 iteration where EVL < VF?
The first one is used to increment IV, another one - to check for number of iterations. We need the first one (which is vsetvli dependable) to correctly count IV.
But looks like I need to adjust the IV here, because otherwise we may have wrong comparison. I'll think more about it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99750/new/

https://reviews.llvm.org/D99750



More information about the llvm-commits mailing list