[llvm-dev] [RFC] Vector Predication
Kaylor, Andrew via llvm-dev
llvm-dev at lists.llvm.org
Thu Jan 31 11:18:28 PST 2019
>> Question 2 - Have you explored using selects instead? What practical
>> problems do you run into which make you believe explicit predication
>> is required?
>> e.g. %sub = fsub <4 x float> %x, %y
>> %result = select <4 x i1> %M, <4 x float> %sub, undef
> That is semantically incorrect. According to IR semantics, the fsub is fully
> evaluated before the select comes along. It could trap for elements where
> %M is 0, whereas a masked intrinsic conveys the proper semantics of
> masking traps for masked-out elements. We need intrinsics and eventually
> (IMHO) fully first-class predication to make this work properly.
The LLVM language reference says, "The default LLVM floating-point environment assumes that floating-point instructions do not have side effects." So that's why this situation has been tolerated. As you're probably aware, we have work in progress to enable a mode where the default FP environment is not assumed and we properly handle FP status flags and exception unmasking. This will absolutely require masked versions of the operations.
I personally like the idea of having masked operations in the general case as opposed to using selects and hoping the selection DAG will pick the right instructions, because it doesn't always work out that way. But I suppose that needs to be weighed against whatever optimization opportunities are missed because of the less general representation. I agree that we should be able to mitigate this by teaching the optimizer to handle masked operations.
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of David Greene via llvm-dev
Sent: Thursday, January 31, 2019 11:04 AM
To: Philip Reames <listmail at philipreames.com>
Cc: via llvm-dev <llvm-dev at lists.llvm.org>; Saito, Hideki <hideki.saito at intel.com>; Topper, Craig <craig.topper at intel.com>; Maslov, Sergey V <sergey.v.maslov at intel.com>
Subject: Re: [llvm-dev] [RFC] Vector Predication
Philip Reames <listmail at philipreames.com> writes:
> Question 1 - Why do we need separate mask and lengths? Can't the
> length be easily folded into the mask operand?
> e.g. newmask = (<4 x i1>)((i4)%y & ((1 << %L) - 1)) and then pattern
> matched in the backend if needed
I'm a little concerned about how difficult it will be to maintain enough information throughout compilation to be able to match this on a machine with an explicit vector length value.
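For readers following the thread, here is a minimal C++ sketch of the folding that Question 1 describes, modelling the <4 x i1> mask as the low bits of an integer. The helper names length_to_mask and fold_length_into_mask are hypothetical, and the sketch assumes at most 32 lanes; it is not taken from the RFC.

```cpp
#include <cstdint>

// Hypothetical helper: build a bit mask with lanes [0, len) active.
// Assumes lanes <= 32; len is clamped to the lane count.
inline std::uint32_t length_to_mask(unsigned len, unsigned lanes) {
  const unsigned n = len < lanes ? len : lanes;
  // Shifting a 32-bit value by 32 is undefined, so handle that case directly.
  return n >= 32 ? 0xFFFFFFFFu : ((std::uint32_t{1} << n) - 1u);
}

// Hypothetical helper: fold an explicit vector length into an existing mask.
// A lane stays active only if its mask bit is set AND its index is < len.
inline std::uint32_t fold_length_into_mask(std::uint32_t mask, unsigned len,
                                           unsigned lanes) {
  return mask & length_to_mask(len, lanes);
}
```

Once the two are combined, the length survives only as a bit pattern. A backend for a machine with an explicit vector length register would have to recognize that pattern again to recover the length, which is the matching concern raised above.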
> Question 2 - Have you explored using selects instead? What practical
> problems do you run into which make you believe explicit predication
> is required?
> e.g. %sub = fsub <4 x float> %x, %y
> %result = select <4 x i1> %M, <4 x float> %sub, undef
That is semantically incorrect. According to IR semantics, the fsub is fully evaluated before the select comes along. It could trap for elements where %M is 0, whereas a masked intrinsic conveys the proper semantics of masking traps for masked-out elements. We need intrinsics and eventually (IMHO) fully first-class predication to make this work properly.
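The difference can be illustrated with a small software model in C++. This is only a sketch: the "invalid" flag below is a hand-rolled stand-in for an FP status flag or trap (set when a subtraction of two non-NaN inputs yields NaN, such as inf - inf), and the names lane_fsub, fsub_then_select, and masked_fsub are hypothetical. Masked-off lanes are filled with 0.0f here purely for determinism, where the IR example uses undef.

```cpp
#include <array>
#include <cmath>
#include <limits>

constexpr int kLanes = 4;
using Vec = std::array<float, kLanes>;
using Mask = std::array<bool, kLanes>;

struct Result {
  Vec value;
  bool invalid_raised;  // models an FP "invalid operation" status flag
};

// One lane of fsub. Sets the modelled flag when non-NaN inputs produce NaN.
inline float lane_fsub(float a, float b, bool &invalid) {
  const float r = a - b;
  if (std::isnan(r) && !std::isnan(a) && !std::isnan(b)) invalid = true;
  return r;
}

// fsub followed by select: every lane is evaluated before the mask is applied.
inline Result fsub_then_select(const Vec &x, const Vec &y, const Mask &m) {
  Result out{};
  Vec sub{};
  for (int i = 0; i < kLanes; ++i) sub[i] = lane_fsub(x[i], y[i], out.invalid_raised);
  for (int i = 0; i < kLanes; ++i) out.value[i] = m[i] ? sub[i] : 0.0f;
  return out;
}

// Masked fsub: masked-off lanes are never evaluated, so they cannot set the flag.
inline Result masked_fsub(const Vec &x, const Vec &y, const Mask &m) {
  Result out{};
  for (int i = 0; i < kLanes; ++i)
    out.value[i] = m[i] ? lane_fsub(x[i], y[i], out.invalid_raised) : 0.0f;
  return out;
}
```

With x = {1, inf, 3, 4}, y = {0.5, inf, 1, 1} and lane 1 masked off, both functions produce the same values on the active lanes, but only fsub_then_select sets the modelled flag, because it evaluates inf - inf in the masked-off lane.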
> My context for these questions is that my experience recently w/o
> existing masked intrinsics shows us missing fairly basic
> optimizations, precisely because they weren't able to reuse all of the
> existing infrastructure. (I've been working on
> SimplifyDemandedVectorElts recently for exactly this reason.) My
> concern is that your EVL proposal will end up in the same state.
I think that's just the nature of the beast. We need IR-level support for masking and we have to teach LLVM about it.