[llvm-dev] [RFC] Vector Predication

Simon Moll via llvm-dev llvm-dev at lists.llvm.org
Wed Feb 27 04:56:03 PST 2019


On 2/25/19 4:22 PM, David Greene wrote:
> "Saito, Hideki" <hideki.saito at intel.com> writes:
>
>> Constrained intrinsics should be extended to take a mask
>> parameter. Scalar call sites should be verified to have constant TRUE
>> mask value.
>>
>>      llvm.experimental.constrained.fdiv(b[i], c[i], tonearest, maytrap),
> Yes, they should and I believe that is part of Simon's proposal.
>
>                                -David

We will make them available as llvm.vp.constrained.fdiv(..etc..)

- Simon

>
>> -----Original Message-----
>> From: David Greene [mailto:dag at cray.com]
>> Sent: Thursday, February 21, 2019 10:28 AM
>> To: Philip Reames <listmail at philipreames.com>
>> Cc: Simon Moll <moll at cs.uni-saarland.de>; via llvm-dev
>> <llvm-dev at lists.llvm.org>; Maslov, Sergey V
>> <sergey.v.maslov at intel.com>; Saito, Hideki <hideki.saito at intel.com>;
>> Topper, Craig <craig.topper at intel.com>
>> Subject: Re: [llvm-dev] [RFC] Vector Predication
>>
>> Philip Reames <listmail at philipreames.com> writes:
>>
>>> I was not suggesting that you rely on pattern matching predication for
>>> correctness.  As you point out, that's obviously incorrect.  I was
>>> assuming that you have a correct but slow lowering for the select
>>> form.  I was suggesting your ISEL attempt to use a predicated
>>> instruction where possible for performance.
>> The whole reason for using predication is performance.  In the
>> presence of traps, the select form should never even be created in
>> the first place.
>>
>>> The point about pattern complexity is an inherent difficulty w/any
>>> intermediate IR.  We do quite well pattern matching complicated
>>> constructs in existing backends - x86 SIMD comes to mind - and I'm
>>> unconvinced that predication is somehow inherently more difficult.
>> Our experience tells us otherwise.  Intrinsics, and ultimately
>> first-class IR support, are the most reasonable way to get both
>> correctness and performance.  How should we translate this to get
>> predicated instructions out?
>>
>>    for (int i=...) {
>>      if( fabs(c[i]) > epsilon) {
>>        a[i] = b[i]/c[i];
>>      }
>>      else {
>>        a[i] = 0;
>>      }
>>    }
>>
>> We can't use select even with constrained intrinsics, because the constrained intrinsics only tell the optimizer they can't be speculated.
>> This is not a legal translation:
>>
>>    %cond = fabs(c[i]) > epsilon
>>    %temp = select %cond,
>>      llvm.experimental.constrained.fdiv(b[i], c[i], tonearest, maytrap),
>>      0
>>    store a[i], %temp
>>
>> According to the IR, we've already speculated llvm.experimental.constrained.fdiv above the test.
>>
>> I believe the only way to do this safely with the current IR is via
>> control flow, and then we have to match complex control flow during
>> isel.  Who knows what other things passes may have put into our
>> carefully constructed basic blocks?
>>
>> The ARM backend has (had?) logic for trying to match predicated scalar things.  I would not wish it on any codegen person.
>>
>>                              -David
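
To make the difference between the two forms concrete, here is a minimal
scalar sketch in C of the loop quoted above.  It is an illustration under
stated assumptions, not code from the proposal: trapping_fdiv, trap_count,
abs_f, div_select_form and div_predicated_form are hypothetical names, and
a "trap" is modeled by bumping a counter rather than raising a hardware
exception.

```c
#include <stddef.h>

/* Hypothetical model of a trap: count it instead of raising a signal. */
int trap_count = 0;

/* Hypothetical stand-in for a divide that traps on a zero divisor. */
float trapping_fdiv(float num, float den) {
    if (den == 0.0f) {
        trap_count++;        /* a real trapping divide would fault here */
        return 0.0f;
    }
    return num / den;
}

/* Local absolute value, to keep the sketch free of libm dependencies. */
static float abs_f(float x) { return x < 0.0f ? -x : x; }

/* Select form: the divide runs on every element and the result is
 * chosen afterwards, so the divide has effectively been speculated
 * above the test. */
void div_select_form(float *a, const float *b, const float *c,
                     size_t n, float epsilon) {
    for (size_t i = 0; i < n; i++) {
        float quotient = trapping_fdiv(b[i], c[i]);  /* unconditional */
        a[i] = (abs_f(c[i]) > epsilon) ? quotient : 0.0f;
    }
}

/* Predicated form: the mask is computed first and the divide only
 * runs on elements whose mask bit is set. */
void div_predicated_form(float *a, const float *b, const float *c,
                         size_t n, float epsilon) {
    for (size_t i = 0; i < n; i++) {
        int mask = abs_f(c[i]) > epsilon;
        a[i] = mask ? trapping_fdiv(b[i], c[i]) : 0.0f;
    }
}
```

Both functions store the same values into a[], but only the select form
touches the divisor on masked-off elements, which is why it cannot stand
in for the original loop once traps are observable.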

-- 

Simon Moll
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de
Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll


