[llvm-dev] [RFC] Vector Predication

Tue Feb 5 00:11:53 PST 2019

On 2/5/19 1:27 AM, Philip Reames via llvm-dev wrote:
>
> On 1/31/19 4:57 PM, Bruce Hoult wrote:
>> On Thu, Jan 31, 2019 at 4:05 PM Philip Reames via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>> Do such architectures frequently have arithmetic operations on the 
>>> mask registers?  (i.e. can I reasonably compute a conservative 
>>> length given a mask register value)  If I can, then having a mask as 
>>> the canonical form and re-deriving the length register from a mask 
>>> for a sequence of instructions which share a predicate seems fairly 
>>> reasonable. Note that I'm assuming this as a fallback, and that the 
>>> common case is handled via the equivalent of ComputeKnownBits on the 
>>> mask itself at compile time.
>> If masking is used (which it is usually not for loops without control
>> flow inside the vectorised loop) then, yes, logical operations on the
>> mask registers will happen at every basic block boundary.
>>
>> But it is NOT the case that you can compute the active vector length
>> VL from an initial mask value. The active vector length is set by the
>> hardware based on the remaining application vector length. The VL can
>> change for each loop iteration -- the normal pattern is for VL to
>> equal VLMAX for initial executions of the loop, and then be less than
>> VLMAX for the final one or two iterations of the loop. For example if
>> VLMAX is 16 and there are 19 elements left in the application vector
>> then the hardware might choose to use 10 elements for the 2nd to last
>> iteration and 9 elements for the last iteration. Or not. Other
>> hardware might choose to perform the last three iterations as 12/12/11
>> instead of 16/10/9. (It is constrained to be monotonic).
>>
>> VL can also be dynamically shortened in the middle of a loop iteration
>> by an unaligned vector load that crosses a protection boundary if the
>> later elements are inaccessible.
> I can't reconcile this complexity with either the snippet on RISC-V 
> which was shared, or the current EVL proposal.  Doesn't this imply 
> that the vector length can change between *every* pair of vector 
> instructions?  If so, how does having it as part of the EVL intrinsics 
> work?

I think this is the usual mixup of AVL and MVL.

AVL: part of the predicate. It can change between vector operations, 
just like a mask can (lightweight).

MVL: the physical vector register length. It can be reconfigured per 
function (RVV only atm), which is heavyweight (a stop-the-world instruction).

The vectorlen parameter in EVL intrinsics is for the AVL.
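
To make the distinction concrete, here is a minimal Python sketch of a 
strip-mined loop. It is not LLVM or hardware API: MVL, pick_avl and 
strip_mine are hypothetical names, and the policy of splitting the tail 
evenly is only an assumption chosen to reproduce the 16/10/9 pattern 
quoted above. The MVL stays fixed while the AVL is chosen anew on every 
iteration.

```python
MVL = 16  # physical vector length, fixed for the whole function (assumed value)


def pick_avl(remaining, mvl=MVL):
    """Hypothetical hardware policy: choose the active length for one iteration."""
    if remaining >= 2 * mvl:
        return mvl
    if remaining > mvl:
        # Split the tail evenly over the last two iterations.
        return (remaining + 1) // 2
    return remaining


def strip_mine(n, mvl=MVL):
    """Return the AVL used in each iteration when processing n elements."""
    lengths = []
    remaining = n
    while remaining > 0:
        avl = pick_avl(remaining, mvl)
        lengths.append(avl)
        remaining -= avl
    return lengths


print(strip_mine(35))  # [16, 10, 9]
```

Other hardware could legally return 12/12/11 here; the only point is that 
the AVL is a per-operation value, which is why it is an operand of the 
intrinsic rather than function-level state.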

>>
>> I'm curious what SVE will do if there is an if/then/else in the middle
>> of a vectorised loop with a shorter-than-maximum vector length. You
>> can't just invert the mask when going from the then-part to the
>> else-part because that would re-enable elements past the end of the
>> vector. You'd need to invert the mask and then AND it with the mask
>> containing the (bitwise representation of) the vector length.
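
The inversion issue described in the quote can be sketched in a few lines 
of Python. This is only a lane-by-lane model with made-up helper names 
(length_mask, else_mask); it assumes masks are plain lists of booleans 
and says nothing about how SVE actually encodes predicates.

```python
def length_mask(vl, width):
    """Mask with the first vl lanes enabled (bitwise form of the vector length)."""
    return [i < vl for i in range(width)]


def else_mask(then_mask, vl):
    """Invert the then-mask, then AND with the length mask so tail lanes stay off."""
    lm = length_mask(vl, len(then_mask))
    return [(not t) and l for t, l in zip(then_mask, lm)]


width, vl = 8, 5
cond = [True, False, True, True, False, True, False, True]

# The then-mask is the branch condition restricted to the active lanes.
then_m = [c and l for c, l in zip(cond, length_mask(vl, width))]

naive = [not t for t in then_m]   # plain inversion
fixed = else_mask(then_m, vl)     # inversion ANDed with the length mask

print(naive)  # [False, True, False, False, True, True, True, True]
print(fixed)  # [False, True, False, False, True, False, False, False]
```

The plain inversion re-enables lanes 5 to 7, which lie past the end of 
the vector; the ANDed version keeps them off.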

If folks have issues with carrying the vlen around even if the target 
only supports masking, we can rephrase EVL using higher-order functions 
with varargs (basically prefixing):

ARM SVE, AVX512 (mask only targets):

    llvm.evl.masked(<16 x i1> mask %M, ...)

     llvm.evl.fsub(<16 x float>, <16 x float>)  ; exists only to get a function handle

     call @llvm.evl.masked.v16f32(%M, @llvm.evl.fsub(v16f32, <16 x float>, <16 x float>))

RISC-V V, SX-Aurora:

     llvm.evl.pred(<16 x i1> mask %M, i32 vlen %VL, ...)

     llvm.evl.pred(%M, %VL, @llvm.evl.fsub, %a, %b)

The main problem with this is that the operand positions are shifted 
relative to regular IR, so API abstractions that accept both forms will 
have to account for that.
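
As a rough illustration of the prefixing idea, here is a small Python 
model. evl_pred and evl_masked are stand-ins named after the forms above, 
not real LLVM intrinsics, and returning None for inactive lanes is my 
assumption for "result not defined". Nothing here reflects an actual 
implementation.

```python
import operator


def evl_pred(mask, vlen, op, *operands):
    """Prefix form: apply op lane-wise where the mask is set and lane < vlen.

    Inactive lanes yield None, standing in for an undefined result.
    """
    out = []
    for i, lanes in enumerate(zip(*operands)):
        active = mask[i] and i < vlen
        out.append(op(*lanes) if active else None)
    return out


def evl_masked(mask, op, *operands):
    """Mask-only form: every lane within the vector width is eligible."""
    return evl_pred(mask, len(mask), op, *operands)


a = [4.0, 3.0, 2.0, 1.0]
b = [1.0, 1.0, 1.0, 1.0]
m = [True, False, True, True]

print(evl_pred(m, 3, operator.sub, a, b))  # [3.0, None, 1.0, None]
print(evl_masked(m, operator.sub, a, b))   # [3.0, None, 1.0, 0.0]
```

Note where the data operands land: a and b are arguments 3 and 4 of 
evl_pred and arguments 2 and 3 of evl_masked, whereas a plain fsub has 
them at positions 0 and 1. Generic code that handles both has to know 
about that offset.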

- Simon


-- 

Simon Moll
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de
Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll