[llvm-dev] [RFC] Vector Predication

Mon Feb 4 16:27:33 PST 2019

On 1/31/19 4:57 PM, Bruce Hoult wrote:
> On Thu, Jan 31, 2019 at 4:05 PM Philip Reames via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Do such architectures frequently have arithmetic operations on the mask registers?  (i.e. can I reasonably compute a conservative length given a mask register value)  If I can, then having a mask as the canonical form and re-deriving the length register from a mask for a sequence of instructions which share a predicate seems fairly reasonable.  Note that I'm assuming this as a fallback, and that the common case is handled via the equivalent of ComputeKnownBits on the mask itself at compile time.
> If masking is used (which it is usually not for loops without control
> flow inside the vectorised loop) then, yes, logical operations on the
> mask registers will happen at every basic block boundary.
>
> But it is NOT the case that you can compute the active vector length
> VL from an initial mask value. The active vector length is set by the
> hardware based on the remaining application vector length. The VL can
> change for each loop iteration -- the normal pattern is for VL to
> equal VLMAX for initial executions of the loop, and then be less than
> VLMAX for the final one or two iterations of the loop. For example if
> VLMAX is 16 and there are 19 elements left in the application vector
> then the hardware might choose to use 10 elements for the 2nd to last
> iteration and 9 elements for the last iteration. Or not. Other
> hardware might choose to perform the last three iterations as 12/12/11
> instead of 16/10/9. (It is constrained to be monotonic).
>
> VL can also be dynamically shortened in the middle of a loop iteration
> by an unaligned vector load that crosses a protection boundary if the
> later elements are inaccessible.
I can't reconcile this complexity with either the snippet on RISC-V which
was shared, or the current EVL proposal.  Doesn't this imply that the 
vector length can change between *every* pair of vector instructions?  
If so, how does having it as part of the EVL intrinsics work?
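
For reference, the per-iteration pattern described above can be modelled in scalar C roughly as follows. This is only a sketch under stated assumptions: `choose_vl` and `strip_mined_add` are hypothetical names, and the simple "take as much as fits" policy is an assumption standing in for the hardware's choice, which may split the tail differently (e.g. 10/9 for 19 remaining elements). It does not model VL being shortened mid-iteration by a faulting load.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware's VL choice. Assumption: take
 * min(remaining, vlmax); real hardware is free to balance the tail. */
size_t choose_vl(size_t remaining, size_t vlmax) {
    return remaining < vlmax ? remaining : vlmax;
}

/* Adds b[i] into a[i] for n elements, in strips of at most vlmax lanes.
 * Returns the number of strips (loop iterations) executed. */
size_t strip_mined_add(int *a, const int *b, size_t n, size_t vlmax) {
    assert(vlmax > 0);
    size_t strips = 0;
    for (size_t i = 0; i < n; ) {
        /* VL is requested again at the top of every iteration. */
        size_t vl = choose_vl(n - i, vlmax);
        /* Models one vector add over the vl active lanes. */
        for (size_t lane = 0; lane < vl; ++lane)
            a[i + lane] += b[i + lane];
        i += vl;
        ++strips;
    }
    return strips;
}
```

In this model VL is fixed for the body of one iteration and only changes between iterations.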
>
> I'm curious what SVE will do if there is an if/then/else in the middle
> of a vectorised loop with a shorter-than-maximum vector length. You
> can't just invert the mask when going from the then-part to the
> else-part because that would re-enable elements past the end of the
> vector. You'd need to invert the mask and then AND it with the mask
> containing the (bitwise representation of) the vector length.
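
A minimal sketch of the mask computation described here, modelling a mask register as a 32-bit integer with one bit per lane. `length_mask` and `else_mask` are hypothetical names for illustration, not SVE or RISC-V intrinsics, and the 32-lane limit is an assumption of the model:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bitmask with the low `vl` bits set, i.e. the
 * bitwise representation of a vector length of `vl` lanes. */
uint32_t length_mask(unsigned vl) {
    assert(vl <= 32);
    /* Shifting a 32-bit value by 32 is undefined in C, so handle it separately. */
    return vl == 32 ? UINT32_MAX : ((UINT32_C(1) << vl) - 1u);
}

/* Mask for the else-part: invert the then-mask, then AND with the length
 * mask so that lanes past the end of the vector stay disabled. */
uint32_t else_mask(uint32_t then_mask, unsigned vl) {
    return ~then_mask & length_mask(vl);
}
```

With a vector length of 5 and a then-mask of 0x05 (lanes 0 and 2), a plain inversion would enable lanes 5 through 31 as well; ANDing with the length mask leaves only lanes 1, 3 and 4.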