[llvm-dev] [RFC] Vector Predication
Bruce Hoult via llvm-dev
llvm-dev at lists.llvm.org
Thu Jan 31 16:57:14 PST 2019
On Thu, Jan 31, 2019 at 4:05 PM Philip Reames via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Do such architectures frequently have arithmetic operations on the mask registers? (i.e. can I reasonably compute a conservative length given a mask register value) If I can, then having a mask as the canonical form and re-deriving the length register from a mask for a sequence of instructions which share a predicate seems fairly reasonable. Note that I'm assuming this as a fallback, and that the common case is handled via the equivalent of ComputeKnownBits on the mask itself at compile time.
If masking is used (which it is usually not for loops without control
flow inside the vectorised loop) then, yes, logical operations on the
mask registers will happen at every basic block boundary.
But it is NOT the case that you can compute the active vector length
VL from an initial mask value. The active vector length is set by the
hardware based on the remaining application vector length. The VL can
change for each loop iteration -- the normal pattern is for VL to
equal VLMAX for initial executions of the loop, and then be less than
VLMAX for the final one or two iterations of the loop. For example if
VLMAX is 16 and there are 19 elements left in the application vector
then the hardware might choose to use 10 elements for the 2nd to last
iteration and 9 elements for the last iteration. Or not. Other
hardware might choose to perform the last three iterations as 12/12/11
instead of 16/10/9. (VL is constrained to be monotonic, i.e. it does
not increase from one iteration to the next.)
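
To make the 16/10/9 versus 12/12/11 example concrete, here is a
rough sketch of a strip-mined loop in which the "hardware" picks VL
from the remaining element count. The two policy functions
(vl_halve_tail, vl_even) and strip_mine are hypothetical names made
up for illustration; they are not taken from any ISA spec, and real
hardware may use some other rule entirely. The total of 35 elements
is just 16 + 19 from the example above.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive operands."""
    return (a + b - 1) // b


def vl_halve_tail(avl, vlmax):
    """Hypothetical policy: use VLMAX while at least 2*VLMAX elements
    remain, then split the remainder evenly over the last two iterations."""
    if avl >= 2 * vlmax:
        return vlmax
    if avl > vlmax:
        return ceil_div(avl, 2)
    return avl


def vl_even(avl, vlmax):
    """Hypothetical policy: spread the remaining elements evenly over the
    minimum number of iterations still needed."""
    iterations = ceil_div(avl, vlmax)
    return ceil_div(avl, iterations)


def strip_mine(n, vlmax, choose_vl):
    """Return the VL used on each iteration for an n-element vector.
    VL depends only on the remaining length, never on a mask value."""
    vls = []
    remaining = n
    while remaining > 0:
        vl = choose_vl(remaining, vlmax)
        vls.append(vl)
        remaining -= vl
    return vls


print(strip_mine(35, 16, vl_halve_tail))  # [16, 10, 9]
print(strip_mine(35, 16, vl_even))        # [12, 12, 11]
```

In both cases the compiler only sees "VL = f(remaining)", so there is
no initial mask value from which VL could be derived.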
VL can also be dynamically shortened in the middle of a loop iteration
by an unaligned vector load that crosses a protection boundary if the
later elements are inaccessible.
I'm curious what SVE will do if there is an if/then/else in the middle
of a vectorised loop with a shorter-than-maximum vector length. You
can't just invert the mask when going from the then-part to the
else-part because that would re-enable elements past the end of the
vector. You'd need to invert the mask and then AND it with the mask
containing the (bitwise representation of) the vector length.
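
As a sketch of that last point, with masks modelled as plain Python
integers (bit i = lane i). This only illustrates the bit arithmetic;
it is not a claim about what SVE code generation actually does, and
the helper names are made up for the example.

```python
def length_mask(vl):
    """Mask with the low vl lanes enabled: the bitwise form of the length."""
    return (1 << vl) - 1


def naive_else_mask(then_mask, vlmax):
    """Plain inversion over all vlmax lanes. Wrong when vl < vlmax,
    because it re-enables lanes past the end of the vector."""
    return ~then_mask & length_mask(vlmax)


def else_mask(then_mask, vl, vlmax):
    """Invert the then-mask, then AND with the length mask so that
    lanes at index vl and above stay disabled."""
    return naive_else_mask(then_mask, vlmax) & length_mask(vl)


# Assumed example values: 8 hardware lanes, 5 active, then-part on lanes 0 and 2.
VLMAX, VL, THEN = 8, 5, 0b00101

print(format(naive_else_mask(THEN, VLMAX), "08b"))  # 11111010 (lanes 5..7 wrongly on)
print(format(else_mask(THEN, VL, VLMAX), "08b"))    # 00011010 (lanes 1, 3, 4 only)
```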