[llvm-dev] [RFC] Vector Predication
Simon Moll via llvm-dev
llvm-dev at lists.llvm.org
Tue Feb 5 00:11:53 PST 2019
On 2/5/19 1:27 AM, Philip Reames via llvm-dev wrote:
> On 1/31/19 4:57 PM, Bruce Hoult wrote:
>> On Thu, Jan 31, 2019 at 4:05 PM Philip Reames via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>> Do such architectures frequently have arithmetic operations on the
>>> mask registers? (i.e. can I reasonably compute a conservative
>>> length given a mask register value) If I can, then having a mask as
>>> the canonical form and re-deriving the length register from a mask
>>> for a sequence of instructions which share a predicate seems fairly
>>> reasonable. Note that I'm assuming this as a fallback, and that the
>>> common case is handled via the equivalent of ComputeKnownBits on the
>>> mask itself at compile time.
>> If masking is used (which it is usually not for loops without control
>> flow inside the vectorised loop) then, yes, logical operations on the
>> mask registers will happen at every basic block boundary.
>> But it is NOT the case that you can compute the active vector length
>> VL from an initial mask value. The active vector length is set by the
>> hardware based on the remaining application vector length. The VL can
>> change for each loop iteration -- the normal pattern is for VL to
>> equal VLMAX for initial executions of the loop, and then be less than
>> VLMAX for the final one or two iterations of the loop. For example if
>> VLMAX is 16 and there are 19 elements left in the application vector
>> then the hardware might choose to use 10 elements for the 2nd to last
>> iteration and 9 elements for the last iteration. Or not. Other
>> hardware might choose to perform the last three iterations as 12/12/11
>> instead of 16/10/9. (It is constrained to be monotonic).
>> VL can also be dynamically shortened in the middle of a loop iteration
>> by an unaligned vector load that crosses a protection boundary if the
>> later elements are inaccessible.
> I can't reconcile this complexity with either the snippet on RISC-V
> which was shared, or the current EVL proposal. Doesn't this imply
> that the vector length can change between *every* pair of vector
> instructions? If so, how does having it as part of the EVL intrinsics
I think this is the usual mixup of AVL and MVL.
AVL: part of the predicate; it can change between vector operations
just like a mask can (lightweight).
MVL: the physical vector register length; it can be reconfigured per
function (RVV only at the moment) and is heavyweight (a stop-the-world
instruction).
The vectorlen parameter in EVL intrinsics is for the AVL.
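To make the distinction concrete, here is a minimal scalar sketch in Python of what "AVL is part of the predicate" means. It is only a model, not the semantics of the proposed intrinsics: the names MVL, evl_fsub and strip_mined_fsub are made up for illustration, inactive lanes are filled with a placeholder value (an assumption; the real result for those lanes may well be undefined), and the loop picks the AVL as min(MVL, remaining), which is just the simplest policy. As noted in the quoted mail, hardware may split the tail differently.

```python
# Scalar model of a predicated vector operation (illustration only).
# MVL, evl_fsub and strip_mined_fsub are hypothetical names, not LLVM APIs.

MVL = 16  # physical vector register length, fixed for the whole function


def evl_fsub(a, b, mask, avl, passthru=0.0):
    """Lane i computes a[i] - b[i] only if mask[i] is set and i < avl."""
    assert len(a) == len(b) == len(mask) == MVL
    assert 0 <= avl <= MVL
    return [
        a[i] - b[i] if (mask[i] and i < avl) else passthru
        for i in range(MVL)
    ]


def strip_mined_fsub(xs, ys):
    """Subtract arrays of any length; the AVL is recomputed every iteration."""
    out, avls = [], []
    pos = 0
    while pos < len(xs):
        avl = min(MVL, len(xs) - pos)   # assumed policy, cheap to change
        pad = [0.0] * (MVL - avl)       # unused lanes, never observed
        a = xs[pos:pos + avl] + pad
        b = ys[pos:pos + avl] + pad
        res = evl_fsub(a, b, [True] * MVL, avl)
        out.extend(res[:avl])           # keep only the active lanes
        avls.append(avl)
        pos += avl
    return out, avls
```

In this model MVL never changes inside the function, while avl is just another operand of each operation, exactly like the mask.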
>> I'm curious what SVE will do if there is an if/then/else in the middle
>> of a vectorised loop with a shorter-than-maximum vector length. You
>> can't just invert the mask when going from the then-part to the
>> else-part because that would re-enable elements past the end of the
>> vector. You'd need to invert the mask and then AND it with the mask
>> containing the (bitwise representation of) the vector length.
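The mask arithmetic described in the quoted paragraph can be sketched as follows. This is a plain Python model over lists of booleans, not a statement about what SVE actually emits; vl_mask and else_mask are hypothetical helper names.

```python
# Model of deriving the else-mask on a mask-only target (illustration only).
# vl_mask and else_mask are hypothetical helpers, not part of any real API.


def vl_mask(vl, width):
    """Mask form of a vector length: lanes below vl are on."""
    return [i < vl for i in range(width)]


def else_mask(then_mask, vl):
    """Invert the then-mask, then AND it with the vector-length mask."""
    in_range = vl_mask(vl, len(then_mask))
    return [(not m) and r for m, r in zip(then_mask, in_range)]
```

With 8 lanes, a vector length of 5 and a then-mask that selects lanes 0 and 2, a bare inversion would switch on lanes 5 to 7, whereas else_mask selects only lanes 1, 3 and 4.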
If folks have issues with carrying the vlen around even if the target
only supports masking, we can rephrase EVL using higher-order functions
with varargs (basically prefixing):
ARM SVE, AVX512 (mask only targets):
llvm.evl.masked(<16 x i1> mask %M, ...)
llvm.evl.fsub(<16 x float>, <16 x float>) ; exists only to get a
call @llvm.evl.masked.v16f32(%M, @llvm.evl.fsub(v16f32, <16 x
float>, <16 x float>)
RISC-V V, SX-Aurora:
llvm.evl.pred(<16 x i1> mask %M, i32 vlen %VL, ...)
llvm.evl.pred(%M, %vl, @llvm.evl.fsub, %a, %b)
The problem with this is mostly that the operand positions are now off
compared to regular IR and the API abstractions that accept both will
have to account for that.
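A rough sketch of what "account for that" would mean for such an abstraction, in Python. Calls are modelled as a callee name plus a list of operand strings; PREFIX_LEN, wrapped_op and data_operands are invented for this illustration and do not correspond to existing LLVM classes. It assumes the prefixed form passes the predicate operands first, then the wrapped operation, then the data operands, as in the llvm.evl.pred example above.

```python
# Hypothetical helper that hides the operand shift of the prefixed form.
# Nothing here is an existing LLVM API; operands are plain strings.

# Number of predicate operands that precede the wrapped operation.
PREFIX_LEN = {"llvm.evl.masked": 1, "llvm.evl.pred": 2}


def wrapped_op(callee, operands):
    """Return the operation actually performed by the call."""
    if callee in PREFIX_LEN:
        return operands[PREFIX_LEN[callee]]
    return callee


def data_operands(callee, operands):
    """Return the data operands, wherever they sit in the operand list."""
    if callee in PREFIX_LEN:
        skip = PREFIX_LEN[callee] + 1  # predicate operands + wrapped op
        return operands[skip:]
    return operands
```

For a regular fsub the data operands start at position 0; for the llvm.evl.pred form they start at position 3, and for llvm.evl.masked at position 2, which is the mismatch every shared code path would have to handle.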
Researcher / PhD Student
Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31
Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de
Fax. +49 (0)681 302-3065 : http://compilers.cs.uni-saarland.de/people/moll