[PATCH] D125301: [LoopVectorize] Add option to use active lane mask for loop control flow

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jun 19 14:21:19 PDT 2022


fhahn added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:764
+    BranchOnCond,
+    BranchOnActiveLaneMask,
   };
----------------
david-arm wrote:
> david-arm wrote:
> > fhahn wrote:
> > > Can `BranchOnCond` be used instead of the dedicated `BranchOnActiveLaneMask`?
> > I created this patch a month ago, which predated your BranchOnCond work. That's why I haven't used it. I can certainly look into this and see if it's possible though?
> So I did look into this. In order to do it this way I have to explicitly generate the Not and ExtractElement operations using VPInstructions, which requires a new VPInstruction::ExtractElement type. It's possible to do this, but then I wasn't sure about the semantics of this new instruction. When passing in a scalar constant of 0 for the lane, it gets widened to something like <vscale x 4 x i32> zeroinitializer for every part. However, I only need a single lane so I'd have to do something like:
> 
>   case VPInstruction::ExtractElement: {
>     Value *Vec = State.get(getOperand(0), Part);
>     Value *Lane = State.get(getOperand(1), VPIteration(0, 0));
>     Value *V = Builder.CreateExtractElement(Vec, Lane);
>     State.set(this, V, Part);
>     break;
>   }
> 
> It feels quite inefficient to go to all the effort of widening, only to discard everything!
> 
> If you still prefer me to proceed with this approach I'm happy to try if you can provide your thoughts on what the new ExtractElement operation should look like?
> So I did look into this. In order to do it this way I have to explicitly generate the Not and ExtractElement operations using VPInstructions, which requires a new VPInstruction::ExtractElement type. 

Extracts are not modeled explicitly at the moment; usually `State.get` will take care of inserting an extract when scalar lanes are requested, if one is needed. I *think* when using a `VPInstruction::Not` as operand for `BranchOnCond`, `State.get` should insert the extract for the first lane, as this is what `BranchOnCond` uses.
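
To illustrate the idea, here is a toy standalone model of "extract on demand". It is not LLVM code: `ToyState`, `setVector`, `getLane` and `branchOnNotOfMask` are hypothetical names made up for this sketch, and the real `State.get` may well behave differently in detail. It only shows the suggested shape: the `not` is produced as a whole vector, and an extract is materialised (and cached) only when the branch asks for lane 0.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the transform state. NOT the LLVM API; it only models
// the idea of caching widened values and extracting a lane lazily.
struct ToyState {
  std::map<std::string, std::vector<bool>> Vectors;        // widened value per def
  std::map<std::string, std::map<unsigned, bool>> Scalars; // cached per-lane scalars
  unsigned ExtractsEmitted = 0;                            // counts modelled extracts

  void setVector(const std::string &Def, std::vector<bool> V) {
    Vectors[Def] = std::move(V);
  }

  // Request one lane. If only the vector form exists, "emit" an extract
  // once and cache the scalar so later requests reuse it.
  bool getLane(const std::string &Def, unsigned Lane) {
    auto &PerLane = Scalars[Def];
    auto It = PerLane.find(Lane);
    if (It != PerLane.end())
      return It->second;
    bool V = Vectors.at(Def).at(Lane);
    ++ExtractsEmitted;
    PerLane[Lane] = V;
    return V;
  }
};

// Hypothetical helper: build the vector `not` of the active lane mask and
// return the value a branch-on-condition would consume, i.e. lane 0 only.
bool branchOnNotOfMask(ToyState &State, const std::vector<bool> &ActiveLaneMask) {
  std::vector<bool> NotMask;
  for (bool Active : ActiveLaneMask)
    NotMask.push_back(!Active);
  State.setVector("not.mask", NotMask);
  return State.getLane("not.mask", 0);
}
```

In this model no explicit extract node is needed in the plan; the single extract appears only because the consumer requested a scalar lane.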


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125301/new/

https://reviews.llvm.org/D125301


