[llvm-dev] [RFC] Vector Predication

David Greene via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 25 07:22:07 PST 2019


"Saito, Hideki" <hideki.saito at intel.com> writes:

> Constrained intrinsics should be extended to take a mask
> parameter. Scalar call sites should be verified to have constant TRUE
> mask value.
>
>     llvm.experimental.constrained.fdiv(b[i], c[i], tonearest, maytrap),

Yes, they should and I believe that is part of Simon's proposal.

                              -David

> -----Original Message-----
> From: David Greene [mailto:dag at cray.com] 
> Sent: Thursday, February 21, 2019 10:28 AM
> To: Philip Reames <listmail at philipreames.com>
> Cc: Simon Moll <moll at cs.uni-saarland.de>; via llvm-dev
> <llvm-dev at lists.llvm.org>; Maslov, Sergey V
> <sergey.v.maslov at intel.com>; Saito, Hideki <hideki.saito at intel.com>;
> Topper, Craig <craig.topper at intel.com>
> Subject: Re: [llvm-dev] [RFC] Vector Predication
>
> Philip Reames <listmail at philipreames.com> writes:
>
>> I was not suggesting that you rely on pattern matching predication for 
>> correctness.  As you point out, that's obviously incorrect.  I was 
>> assuming that you have a correct but slow lowering for the select 
>> form.  I was suggesting your ISEL attempt to use a predicated 
>> instruction where possible for performance.
>
> The whole reason for using predication is performance.  In the
> presence of traps, the select form should never even be created in
> the first place.
>
>> The point about pattern complexity is an inherent difficulty w/any 
>> intermediate IR.  We do quite well pattern matching complicated 
>> constructs in existing backends - x86 SIMD comes to mind - and I'm 
>> unconvinced that predication is somehow inherently more difficult.
>
> Our experience tells us otherwise.  Intrinsics, and ultimately
> first-class IR support, are the most reasonable way to get both
> correctness and performance.  How should we translate this to get
> predicated instructions out?
>
>   for (int i=...) {
>     if( fabs(c[i]) > epsilon) {
>       a[i] = b[i]/c[i];
>     }
>     else {
>       a[i] = 0;  
>     }
>   }
>
> We can't use select even with constrained intrinsics, because the constrained intrinsics only tell the optimizer they can't be speculated.
> This is not a legal translation:
>
>   %cond = fabs(c[i]) > epsilon
>   %temp = select %cond,
>     llvm.experimental.constrained.fdiv(b[i], c[i], tonearest, maytrap),
>     0
>   store a[i], %temp
>
> According to the IR, we've already speculated llvm.experimental.constrained.fdiv above the test.
>
> I believe the only way to safely do this with the current IR is via control flow, and now we have to match complex control flow during isel.
> Who knows what other things passes may have put into our carefully constructed basic blocks?
>
> The ARM backend has (had?) logic for trying to match predicated scalar things.  I would not wish it on any codegen person.
>
>                             -David
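
The difference between the select form and a predicated form in the quoted
message can be sketched in plain C.  This is only a scalar model under stated
assumptions, not LLVM IR and not anything proposed in the thread:
trapping_fdiv, run_select, run_predicated, demo and LANES are hypothetical
names, and a counter stands in for a real floating-point trap.  The select
form performs the division on every lane before choosing a result, so the
zero divisor is hit; the predicated form lets the mask decide whether the
division executes at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define LANES 4

/* Counts divisions that would have raised a divide-by-zero trap. */
static int traps;

/* Hypothetical stand-in for a trapping fdiv: it records the fault
 * instead of raising it, so the example runs to completion. */
static float trapping_fdiv(float x, float y) {
    if (y == 0.0f) {
        traps++;
        return 0.0f;
    }
    return x / y;
}

static float abs_f(float x) { return x < 0.0f ? -x : x; }

/* Select form: the division executes on every lane, and only
 * afterwards is the result chosen.  Returns the trap count. */
static int run_select(float *a, const float *b, const float *c, float eps) {
    traps = 0;
    for (int i = 0; i < LANES; i++) {
        float q = trapping_fdiv(b[i], c[i]); /* runs even when c[i] == 0 */
        a[i] = abs_f(c[i]) > eps ? q : 0.0f;
    }
    return traps;
}

/* Predicated form: the mask decides whether the division runs at all.
 * Returns the trap count. */
static int run_predicated(float *a, const float *b, const float *c, float eps) {
    traps = 0;
    for (int i = 0; i < LANES; i++) {
        bool mask = abs_f(c[i]) > eps;
        a[i] = mask ? trapping_fdiv(b[i], c[i]) : 0.0f;
    }
    return traps;
}

/* Small usage example: one lane has a zero divisor. */
void demo(void) {
    const float b[LANES] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float c[LANES] = {2.0f, 0.0f, 4.0f, 0.5f};
    float a_sel[LANES], a_pred[LANES];

    int sel_traps = run_select(a_sel, b, c, 1e-6f);
    int pred_traps = run_predicated(a_pred, b, c, 1e-6f);
    printf("select traps: %d, predicated traps: %d\n", sel_traps, pred_traps);

    /* Both forms store the same values; only the side effects differ. */
    for (int i = 0; i < LANES; i++)
        assert(a_sel[i] == a_pred[i]);
}
```

Calling demo() from a main function shows that the stored values are the same
in both forms; what differs is whether the division on the masked-off lane
was ever executed.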

