[llvm-dev] [RFC] Vector Predication
David Greene via llvm-dev
llvm-dev at lists.llvm.org
Mon Feb 25 07:22:07 PST 2019
"Saito, Hideki" <hideki.saito at intel.com> writes:
> Constrained intrinsics should be extended to take a mask
> parameter. Scalar call sites should be verified to have constant TRUE
> mask value.
>
> llvm.experimental.constrained.fdiv(b[i], c[i], tonearest, maytrap),
Yes, they should and I believe that is part of Simon's proposal.
-David
> -----Original Message-----
> From: David Greene [mailto:dag at cray.com]
> Sent: Thursday, February 21, 2019 10:28 AM
> To: Philip Reames <listmail at philipreames.com>
> Cc: Simon Moll <moll at cs.uni-saarland.de>; via llvm-dev
> <llvm-dev at lists.llvm.org>; Maslov, Sergey V
> <sergey.v.maslov at intel.com>; Saito, Hideki <hideki.saito at intel.com>;
> Topper, Craig <craig.topper at intel.com>
> Subject: Re: [llvm-dev] [RFC] Vector Predication
>
> Philip Reames <listmail at philipreames.com> writes:
>
>> I was not suggesting that you rely on pattern matching predication for
>> correctness. As you point out, that's obviously incorrect. I was
>> assuming that you have a correct but slow lowering for the select
>> form. I was suggesting your ISEL attempt to use a predicated
>> instruction where possible for performance.
>
> The whole reason for using predication is performance. In the
> presence of traps, the select form should never even be created in
> the first place.
>
>> The point about pattern complexity is an inherent difficulty w/any
>> intermediate IR. We do quite well pattern matching complicated
>> constructs in existing backends - x86 SIMD comes to mind - and I'm
>> unconvinced that predication is somehow inherently more difficult.
>
> Our experience tells us otherwise. Intrinsics, and ultimately
> first-class IR support, are the most reasonable way to get
> correctness and performance. How should we translate this to get
> predicated instructions out?
>
> for (int i=...) {
>   if (fabs(c[i]) > epsilon) {
>     a[i] = b[i]/c[i];
>   }
>   else {
>     a[i] = 0;
>   }
> }
>
> We can't use select even with constrained intrinsics, because the constrained intrinsics only tell the optimizer they can't be speculated.
> This is not a legal translation:
>
> %cond = fabs(c[i]) > epsilon
> %temp = select %cond,
>           llvm.experimental.constrained.fdiv(b[i], c[i], tonearest, maytrap),
>           0
> store a[i], %temp
>
> According to the IR, we've already speculated llvm.experimental.constrained.fdiv above the test.
>
> I believe the only way to do this safely with the current IR is via control flow, and then we have to match complex control flow during isel.
> Who knows what other things passes may have put into our carefully constructed basic blocks?
>
> The ARM backend has (had?) logic for trying to match predicated scalar things. I would not wish it on any codegen person.
>
> -David
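
The difference between the control-flow form and the select form in the
quoted message can be sketched in scalar C. This is only an illustration
of the semantics under discussion, not LLVM code: the names div_trace,
traced_div, div_with_branch and div_with_select are hypothetical, and the
counter stands in for "a division that could trap was executed". Both
functions store the same results, but the select form runs the division
on every lane, including those the guard rejects.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical instrumentation: records every division that actually
   executes, and how many of those the guard would have rejected. */
struct div_trace {
    size_t executed;
    size_t unguarded;
};

static double traced_div(double num, double den, double epsilon,
                         struct div_trace *t) {
    t->executed++;
    if (!(fabs(den) > epsilon))
        t->unguarded++;          /* this division ran on a masked-off lane */
    return num / den;
}

/* Control-flow form: the division executes only where the guard holds. */
static void div_with_branch(double *a, const double *b, const double *c,
                            size_t n, double epsilon, struct div_trace *t) {
    for (size_t i = 0; i < n; ++i) {
        if (fabs(c[i]) > epsilon)
            a[i] = traced_div(b[i], c[i], epsilon, t);
        else
            a[i] = 0.0;
    }
}

/* Select form: the division is an operand of the select, so it has
   already executed on every lane before the condition picks a value. */
static void div_with_select(double *a, const double *b, const double *c,
                            size_t n, double epsilon, struct div_trace *t) {
    for (size_t i = 0; i < n; ++i) {
        double quotient = traced_div(b[i], c[i], epsilon, t);
        a[i] = (fabs(c[i]) > epsilon) ? quotient : 0.0;
    }
}
```

With c = {2.0, 0.0, 4.0, 0.0}, the branch form executes two divisions and
none of them is unguarded; the select form executes four, two of them with
a zero denominator. A predicated divide is meant to give the first
behaviour without the control flow.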