[llvm-dev] [RFC] Vector Predication

Thu Jan 31 11:17:18 PST 2019

On 1/31/19 11:03 AM, David Greene wrote:
> Philip Reames <listmail at philipreames.com> writes:
>
>> Question 1 - Why do we need separate mask and lengths? Can't the
>> length be easily folded into the mask operand?
>>
>> e.g. newmask = (<4 x i1>)((i4)%y & ((1 << %L) - 1))
>> and then pattern matched in the backend if needed
> I'm a little concerned about how difficult it will be to maintain enough
> information throughout compilation to be able to match this on a machine
> with an explicit vector length value.
Does the hardware *also* have a mask register?  If so, this is likely a
minor code-quality issue that can be refined incrementally.  If it
doesn't, then I can see your concern.
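
To make the folding from Question 1 concrete, here is a minimal scalar
sketch in C++.  It only models the bit arithmetic, not any actual LLVM
lowering; the helper name foldLengthIntoMask and the "lanes" parameter
are hypothetical, introduced purely for illustration.  Lane i stays
active only if it is set in the incoming mask and i is below the
explicit length.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): fold an explicit vector length
// into a lane mask.  Bit i of the result is set only if bit i of `mask`
// is set and i < length.  `lanes` is the vector width, assumed <= 32.
inline std::uint32_t foldLengthIntoMask(std::uint32_t mask, unsigned length,
                                        unsigned lanes) {
  // Clamp the length to the vector width so oversized lengths are harmless.
  const unsigned active = length < lanes ? length : lanes;
  // Build a prefix of `active` one-bits; avoid shifting by the full
  // word width, which would be undefined behaviour.
  const std::uint32_t prefix =
      active >= 32 ? 0xFFFFFFFFu : ((1u << active) - 1u);
  return mask & prefix;
}
```

A backend for a machine with an explicit vector-length register would
have to recognize this "mask AND prefix" shape and recover the length
from it, which is the pattern-matching step David is concerned about.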
>
>> Question 2 - Have you explored using selects instead? What practical
>> problems do you run into which make you believe explicit predication
>> is required?
>>
>> e.g. %sub = fsub <4 x float> %x, %y
>> %result = select <4 x i1> %M, <4 x float> %sub, <4 x float> undef
> That is semantically incorrect.  According to IR semantics, the fsub is
> fully evaluated before the select comes along.  It could trap for
> elements where %M is 0, whereas a masked intrinsic conveys the proper
> semantics of masking traps for masked-out elements.  We need intrinsics
> and eventually (IMHO) fully first-class predication to make this work
> properly.

If you want specific trap behavior, you need to use the constrained 
family of intrinsics instead.  In IR, fsub is expected not to trap.

We have an existing solution for modeling FP environment aspects such as 
rounding and trapping.  The proposed signatures for your EVL proposal do 
not appear to subsume those, and you've not proposed their retirement.  
We definitely don't want *two* ways of describing FP trapping.

In other words, I don't find this reason compelling since my example can 
simply be rewritten using the appropriate constrained intrinsic.

>
>> My context for these questions is that my recent experience with our
>> existing masked intrinsics shows us missing fairly basic
>> optimizations, precisely because they weren't able to reuse all of the
>> existing infrastructure. (I've been working on
>> SimplifyDemandedVectorElts recently for exactly this reason.) My
>> concern is that your EVL proposal will end up in the same state.
> I think that's just the nature of the beast.  We need IR-level support
> for masking and we have to teach LLVM about it.
I'm solidly of the opinion that we already *have* IR support for
explicit masking in the form of gather/scatter/etc.  Until someone has
taken the effort to make masking in this context *actually work well*,
I'm unconvinced that we should greatly expand its usage in the IR.
>
>                             -David