[PATCH] D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization

Wed May 25 08:26:14 PDT 2022

ABataev added a comment.

In D126363#3537348 <https://reviews.llvm.org/D126363#3537348>, @simoll wrote:

> In D126363#3537242 <https://reviews.llvm.org/D126363#3537242>, @ABataev wrote:
>
>> In D126363#3537197 <https://reviews.llvm.org/D126363#3537197>, @simoll wrote:
>>
>>> In D126363#3536686 <https://reviews.llvm.org/D126363#3536686>, @ABataev wrote:
>>>
>>>> Hi Simon, did you think about making EVL a member of VPlan just like TripCount? In this case we might not need lots of these new classes.
>>>
>>> Hi Alexey! The EVL behaves like a mask and less like the TripCount. When used for tail predication, the value of EVL still depends on the current vector iteration and needs to be computed in the vector loop.
>>
>> But you can also treat it as an effective vector factor and use it similarly to VectorTripCount. Introducing new nodes just to add an extra EVL operand does not look necessary.
>
> I was looking at the comments we got earlier for the reference implementation. In particular @hahn 's comment on the EVL being loop-invariant when it's not used for tail predication <https://reviews.llvm.org/D99750#inline-967117>.
> The thing is, when EVL **is** used for tail predication you need to re-compute it in every vector loop iteration. I don't see how EVL could be handled like VectorTripCount in this case. Could you elaborate?

I have this scheme in mind:

  vector.body:
    %canon.iv = phi int
    %evl = evl(%canon.iv, %vector.trip.count)
    ...
    br ...

We store %evl as the value associated with VPValue *EVL using State.set(EVL, %evl, Part) and then retrieve it where needed using State.get(EVL, Part).
In this case we can treat EVL similarly to the canonical IV, which is not an invariant either.
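
To make the scheme above concrete, here is a minimal, self-contained sketch. It is a toy model, not VPlan code: ToyState, computeEVL, runVectorLoop and the EVL id are hypothetical names, and the sketch assumes that evl(%canon.iv, %trip.count) means min(VF, TripCount - CanonIV), with the trip count taken to be the total number of scalar iterations. It only illustrates that a single EVL value can be re-computed in every vector iteration and still be shared through set/get.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for VPTransformState: maps (value id, part) to the
// value generated for it in the current vector iteration.
struct ToyState {
  std::map<std::pair<int, unsigned>, uint64_t> Data;
  void set(int Def, uint64_t V, unsigned Part) { Data[{Def, Part}] = V; }
  uint64_t get(int Def, unsigned Part) const { return Data.at({Def, Part}); }
};

// Hypothetical id playing the role of the single "VPValue *EVL".
constexpr int EVL = 0;

// Assumed semantics of evl(%canon.iv, %trip.count): the number of lanes
// that are still active in this vector iteration.
uint64_t computeEVL(uint64_t CanonIV, uint64_t TripCount, uint64_t VF) {
  assert(CanonIV <= TripCount && "canonical IV ran past the trip count");
  return std::min(VF, TripCount - CanonIV);
}

// Runs the toy vector loop and returns the EVL observed in each iteration.
std::vector<uint64_t> runVectorLoop(uint64_t TripCount, uint64_t VF) {
  ToyState State;
  std::vector<uint64_t> Seen;
  for (uint64_t CanonIV = 0; CanonIV < TripCount; CanonIV += VF) {
    // Re-computed at the top of every iteration, like %evl in vector.body.
    State.set(EVL, computeEVL(CanonIV, TripCount, VF), /*Part=*/0);
    // Any other "recipe" reads the same value back instead of owning it.
    Seen.push_back(State.get(EVL, /*Part=*/0));
  }
  return Seen;
}
```

With TripCount = 10 and VF = 4, the three vector iterations see EVL values 4, 4 and 2, so the value is loop-variant even though only one EVL entry exists.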

>>> Also - depending on your target - setting EVL is relatively lightweight and instructions in the vector loop may have different EVLs in the future.
>>
>> In what case can this happen? I believe for an unrolled loop? But it can be handled by the VPTransformState::Part and VPTransformState::set/get functions.
>
> For unrolling/interleaving, sure. I was thinking of optimizations that compact a mask and use `EVL == number-of-ones-in-the-mask` to densely operate on the compressed vectors - nothing we need to concern ourselves with for the time being.

Could you elaborate on how it may affect the EVL value? Can we have different EVLs in the same loop? Or is it not an EVL, but some kind of transformation of the original EVL value?

> Here is my suggestion:
>
> 1. We get an explicit-vector-length recipe to compute EVL inside the vector loop. And this will be the only recipe we add because..
> 2. We extend the existing recipes with an (optional) EVL operand. Presence of EVL implies that VP intrinsics are used for widening.

I'm afraid that it will require a HUGE(!!!) amount of changes in VPlan. I assume there will still be the same recipe/VPValue for EVL across all recipes/VPInstructions.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126363/new/

https://reviews.llvm.org/D126363