[llvm-dev] Vectorizing minimum without function attributes

Fri May 10 13:08:53 PDT 2019

https://reviews.llvm.org/D61802

On Fri, May 10, 2019 at 8:49 PM Nicolau Werneck <nwerneck at gmail.com> wrote:

> I have a patch proposal, should I just go ahead and start a pull request?
> Where should the test file go?
>
> On Tue, May 7, 2019 at 9:01 PM Sanjay Patel <spatel at rotateright.com>
> wrote:
>
>> Yes, if we can solve the motivating case by detecting the IR flag(s),
>> then I think that's a good fix. We probably want to
>> leave the function attribute check in place though until/unless we are
>> sure there is no chance of regression by removing that check.
>>
>> I'm looking again at some of the related bugs, and I'm hoping to solve
>> some of those too.
>>
>> On Mon, May 6, 2019 at 1:18 PM Nicolau Werneck <nwerneck at gmail.com>
>> wrote:
>>
>>> Thanks for checking this out. I think your second comment on 35538 must
>>> be precisely what I'm talking about.
>>>
>>> I came up with this IR that might be used for testing:
>>>
>>> ```
>>> define float @minloop(float* nocapture readonly)  #0 {
>>> top:
>>>   %1 = load float, float* %0
>>>   br label %loop
>>>
>>> loop:
>>>   %2 = phi i64 [ %8, %loop ], [ 1, %top ]
>>>   %3 = phi float [ %7, %loop ], [ %1, %top ]
>>>   %4 = getelementptr float, float* %0, i64 %2
>>>   %5 = load float, float* %4, align 4
>>>   %6 = fcmp fast olt float %3, %5
>>>   %7 = select i1 %6, float %3, float %5
>>>   %8 = add i64 %2, 1
>>>   %9 = icmp eq i64 %8, 65537
>>>   br i1 %9, label %out, label %loop
>>>
>>> out:
>>>   ret float %7
>>> }
>>>
>>> attributes #0 = { "no-nans-fp-math"="true"}
>>> ```
>>>
>>> I can get vectorized code if I use this:
>>> opt -S minloop.ll -O2 -force-vector-width=8
>>>
>>> Setting the attributes is necessary, and commenting it prevents the
>>> vectorization. Neither the fcmp fast flag or the command-line argument
>>> -fp-contract=fast have any effect.
>>>
>>> The code that seems relevant to this issue is 6 years old. In fact, its
>>> birthday was just yesterday!
>>> https://github.com/llvm/llvm-project/commit/d96e427eacdd4d48da3a1f93a64008ef4dadcc8a#diff-b3c08c1a517adb5ef635cdf9cf224904R3087
>>>
>>> If support for fast flags in fcmp was really introduced only after that,
>>> then this was probably just never updated, and the fix might be as simple
>>> as replacing `HasFunNoNaNAttr` by `I->hasNoNaNs()`. Does that sound like a
>>> possible PR?
>>>
>>>
>>>
>>> On Mon, May 6, 2019 at 3:31 PM Sanjay Patel <spatel at rotateright.com>
>>> wrote:
>>>
>>>> We have several problems with FP min/max optimization, so it would help
>>>> to see an IR example. But I agree that we should not require a function
>>>> attribute to enable the optimization.
>>>>
>>>> This might be the closest match based on the description:
>>>> https://bugs.llvm.org/show_bug.cgi?id=35538
>>>>
>>>> Other candidates:
>>>> https://bugs.llvm.org/show_bug.cgi?id=36982
>>>> https://bugs.llvm.org/show_bug.cgi?id=34149
>>>> https://bugs.llvm.org/show_bug.cgi?id=26956
>>>> https://bugs.llvm.org/show_bug.cgi?id=35284
>>>>
>>>>
>>>> On Sat, May 4, 2019 at 3:51 PM Nicolau Werneck via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Thanks for the reply. I should say I'm actually working on 6.0, but I
>>>>> don't think this part of the code changed much since. These are traces I
>>>>> made with GDB optimizing a loop with floats and then integers, showing
>>>>> where they diverge:
>>>>> https://gist.github.com/nlw0/58ed9fda8e8944a9cb5e5a20f6038fcf
>>>>>
>>>>> This is the point I believe we need to set the function attribute for
>>>>> the vectorization to work with floats:
>>>>>
>>>>> https://github.com/llvm/llvm-project/blob/fd254e429ea103be8bab6271855c04919d33f9fb/llvm/lib/Analysis/IVDescriptors.cpp#L590
>>>>>
>>>>> Could this be a bug? It seems to me it just wasn't changed yet to
>>>>> depend only on instruction flags.
>>>>>
>>>>> I would gladly work on refactoring this if there's an opportunity, but
>>>>> I'm a complete newbie in this project. It would be great to hear from
>>>>> someone more knowledgeable what can be done about this issue, especially if
>>>>> turns out to be a very small patch!
>>>>>
>>>>>
>>>>> On Sat, May 4, 2019 at 3:41 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:
>>>>>
>>>>>> On 5/4/19 6:36 AM, llvm-dev wrote:
>>>>>>
>>>>>> Greetings,
>>>>>>
>>>>>> The LLVM loop vectorizer does a great job handling reductions with
>>>>>> the `min(a, b)` function over an array of integers or floats. This finds
>>>>>> the smallest value of a list exploiting SIMD instructions, and works just
>>>>>> as well as a summation.
>>>>>>
>>>>>> Specifically with floats, though, using the `fcmp` instruction, the
>>>>>> vectorization seems to require the function attribute "no-nans-fp-math" to
>>>>>> be set. Just setting instruction flags is not enough.
>>>>>>
>>>>>>
>>>>>> fcmp takes fast-math flags now, but that wasn't always true (my
>>>>>> recollection is that was a capability added after the arithmetic
>>>>>> operations). In any case, I wonder if this is just a hold-over from before
>>>>>> fcmp took fast-math flags, or if this is an && condition that should be an
>>>>>> || condition.
>>>>>>
>>>>>>
>>>>>> This forces us to give up on fine-grained control of fast-math in the
>>>>>> code in order to benefit from this vectorization.
>>>>>>
>>>>>> How to overcome this? LLVM has intrinsic functions such as `minnum`
>>>>>> and `minimum` (`minnan`) that accurately represent the operation. This
>>>>>> could permit fine-grained control of fast-math flags, although the
>>>>>> vectorizer seems to ignore these intrinsics.
>>>>>>
>>>>>> Beyond this specific case, it would be nice to be sure when is it
>>>>>> ever necessary to set these function attributes, e.g.
>>>>>>
>>>>>> https://github.com/llvm/llvm-project/blob/8205a814a691bfa62fed911b58b0a306ab5efe31/clang/lib/CodeGen/CGCall.cpp#L1743-L1750
>>>>>>
>>>>>> What would be a way to control the vectorization for `min` without
>>>>>> having to rely on that function attribute? And furthermore, could LLVM
>>>>>> optimizations conceivably depend only on instruction flags, and not ever on
>>>>>> function attributes? What would be necessary to achieve this?
>>>>>>
>>>>>>
>>>>>> The goal has been to eliminate the dependence on the function
>>>>>> attributes once all of the necessary local flags are in place. Obviously I
>>>>>> could be missing something, but this just seems like a bug.
>>>>>>
>>>>>>  -Hal
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> --
>>>>>> Nicolau Werneck <nwerneck at gmail.com>
>>>>>> http://n <http://nwerneck.sdf.org>ic.hpavc.net
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing listllvm-dev at lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Nicolau Werneck <nwerneck at gmail.com>
>>>>> http://n <http://nwerneck.sdf.org>ic.hpavc.net
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>
>>>
>>> --
>>> Nicolau Werneck <nwerneck at gmail.com>
>>> http://n <http://nwerneck.sdf.org>ic.hpavc.net
>>>
>>
>
> --
> Nicolau Werneck <nwerneck at gmail.com>
> http://n <http://nwerneck.sdf.org>ic.hpavc.net
>

-- 
Nicolau Werneck <nwerneck at gmail.com>
http://n <http://nwerneck.sdf.org>ic.hpavc.net
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190510/a5fe0d75/attachment.html>