[PATCH] D99750: [LV, VP]VP intrinsics support for the Loop Vectorizer

Mon Oct 2 13:43:10 PDT 2023

fhahn added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8112
   // When not folding the tail, use nullptr to model all-true mask.
-  if (!CM.foldTailByMasking()) {
+  if (!CM.foldTailByMasking() || CM.useVPIVectorization()) {
     BlockMaskCache[Header] = nullptr;
----------------
ABataev wrote:
> fhahn wrote:
> > ABataev wrote:
> > > fhahn wrote:
> > > > ABataev wrote:
> > > > > fhahn wrote:
> > > > > > Better to replace the mask together with introducing EVL to make sure EVL gets added when the mask gets removed?
> > > > > Currently it will require some extra work. We'll need to handle both cases, with active-lane intrinsics and direct comparison. Would it be possible to keep it for now and fix it once you land emission of the active-lane intrinsic in a VPlan-to-VPlan transform?
> > > > With the latest version, can the `useVPWithVPEVLVectorization` part be dropped (if the transform is updated to remove the mask from loads/stores)?
> > > Not quite, it will require an extra VPValue, something like VPAllTrueMask, which should replace IV <= BTC. Shall I add it?
> > Would a live-in `i1 true` work? I think that may work as is. As EVL is only used for lowering loads/stores at the moment, the mask should only be removed there for now?
> You mean scalar i1 true?
Yes, that should be broadcast across all vector lanes.
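
To illustrate the idea, here is a minimal standalone sketch (a scalar model, not LLVM or VPlan code). It assumes VP load/store semantics where a lane is active only if its mask bit is set and its index is below EVL, and it assumes EVL is computed as min(VF, remaining iterations). The names `VF`, `makeBuffer`, `storeWithHeaderMask` and `storeWithEVL` are hypothetical and exist only for this example. Under those assumptions, a scalar `true` broadcast to every lane plus EVL should write the same elements as the `IV <= BTC` header mask:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of one vectorized store loop; not LLVM code.
constexpr std::size_t VF = 4; // assumed vectorization factor

// Buffer padded to a whole number of vectors, so a lane that wrongly stays
// active past the trip count would show up as a non-zero padding element.
static std::vector<int> makeBuffer(std::size_t TripCount) {
  return std::vector<int>((TripCount + VF - 1) / VF * VF, 0);
}

// Tail folding by masking: the lane at index IV + Lane is active iff
// IV + Lane <= BTC, where BTC is the backedge-taken count (TripCount - 1).
std::vector<int> storeWithHeaderMask(std::size_t TripCount, int Val) {
  assert(TripCount > 0 && "BTC is only defined for a non-zero trip count");
  std::vector<int> Mem = makeBuffer(TripCount);
  const std::size_t BTC = TripCount - 1;
  for (std::size_t IV = 0; IV < TripCount; IV += VF)
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      if (IV + Lane <= BTC)
        Mem[IV + Lane] = Val;
  return Mem;
}

// EVL-based variant: the mask is a single scalar `true` broadcast to all
// lanes, and the explicit vector length alone cuts off the tail.
std::vector<int> storeWithEVL(std::size_t TripCount, int Val) {
  assert(TripCount > 0 && "modelled only for a non-zero trip count");
  std::vector<int> Mem = makeBuffer(TripCount);
  const bool AllTrue = true; // stands in for the live-in `i1 true`
  for (std::size_t IV = 0; IV < TripCount; IV += VF) {
    // Assumption: EVL is the number of iterations left, capped at VF.
    const std::size_t EVL = std::min(VF, TripCount - IV);
    for (std::size_t Lane = 0; Lane < VF; ++Lane) {
      const bool LaneMask = AllTrue; // broadcast of the scalar to this lane
      if (LaneMask && Lane < EVL)
        Mem[IV + Lane] = Val;
    }
  }
  return Mem;
}
```

For a trip count of 6 and VF of 4, both functions fill elements 0 through 5 and leave the two padding elements untouched, which is why dropping the header mask only where EVL is applied (loads/stores for now) should be safe in this model.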

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99750/new/

https://reviews.llvm.org/D99750