[cfe-dev] RFC: Add New Set of Vector Math Builtins

Tue Sep 28 20:48:12 PDT 2021

On Tue, Sep 28, 2021 at 2:10 AM Florian Hahn <florian_hahn at apple.com> wrote:

> Hi Craig,
>
> > On Sep 27, 2021, at 23:54, Craig Topper <craig.topper at gmail.com> wrote:
> >
> > Hi Florian,
> >
> > I have a few questions about the reduction builtins.
> >
>
> Thanks for taking a look!
>
> > llvm.reduce.fadd is currently defined as ordered unless the reassociate
> > fast math flag is present. Are you proposing to change that to make it
> > pairwise?
> >
>
> That’s a good point and I forgot to explicitly call this out! The
> reduction builtin unfortunately cannot express pairwise reductions and the
> reassociate flag would be too permissive. An initial lowering in Clang
> could just generate the pairwise reduction tree directly, but down the road
> I anticipate improving the reduction builtin to allow expressing pairwise
> reductions. This would probably be helpful for parts of the middle-end too
> which at the moment manually emit pairwise reduction trees (e.g. in the
> epilogue of vector loops with reductions).
>

I didn't think the vectorizers used pairwise reductions. The cost modelling
flag for it was removed in https://reviews.llvm.org/D105484

FWIW, the X86 backend barely uses the pairwise reduction instructions like
haddps. Their implementation is suboptimal on most CPUs, which makes them a
poor choice for reducing over a single register.
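
To make the ordered/pairwise distinction concrete, here is a minimal sketch
in plain C. It is not Clang or LLVM code: the function names are hypothetical,
and "pairwise" is assumed to mean an adjacent-pair tree (the shape haddps
computes) over a power-of-two number of lanes.

```c
#include <assert.h>
#include <stddef.h>

/* Ordered reduction: (((start + v[0]) + v[1]) + v[2]) + ... */
static float reduce_fadd_ordered(float start, const float *v, size_t n) {
    float acc = start;
    for (size_t i = 0; i < n; ++i)
        acc += v[i];
    return acc;
}

/* Pairwise reduction (assumed shape): (v[0] + v[1]) + (v[2] + v[3]), ...
   Requires n to be a power of two and at most 16, like a small fixed vector. */
static float reduce_fadd_pairwise(const float *v, size_t n) {
    assert(n > 0 && n <= 16 && (n & (n - 1)) == 0);
    float tmp[16];
    for (size_t i = 0; i < n; ++i)
        tmp[i] = v[i];
    /* Each round adds adjacent pairs and halves the number of live lanes. */
    for (size_t half = n / 2; half > 0; half /= 2)
        for (size_t i = 0; i < half; ++i)
            tmp[i] = tmp[2 * i] + tmp[2 * i + 1];
    return tmp[0];
}
```

With IEEE single-precision evaluation and no fast-math flags, the input
{1e8f, 1.0f, -1e8f, 1.0f} gives 1.0f for the ordered sum (start 0.0f) and 0.0f
for the adjacent-pair sum, so the two definitions are observably different.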

>
> > llvm.reduce.fmin/fmax change behavior based on the nonans fast math
> > flag. And I think they always imply no signed zeros regardless of whether
> > the fast math flag is present. The vectorizers check the fast math flags
> > before creating the intrinsics today. What are the semantics of the
> > proposed builtin?
>
>
> I tried to specify NaN handling in the `Special Values` section. At the
> moment it says “If exactly one argument is a NaN, return the other
> argument. If both arguments are NaNs, return a NaN”. This should match both
> the NaN handling of llvm.minnum and libm’s fmin(f). Note that in the
> original email, the Special Values section still includes a mention of
> fmax. That reference should be removed.
>
> The current proposal does not specifically talk about signed zeros, but I
> am not sure we have to. The proposal defines min/max as returning the
> smaller/larger value. -0 and +0 compare equal, so either can be returned.
> I think this again matches the behavior of libm’s fmin(f) and llvm.minnum,
> although llvm.minnum’s definition calls this out explicitly by stating
> what happens when called with equal arguments. Should the proposed
> definitions also spell that out?
>
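
The quoted special-value rule can be written out by hand and compared with
libm's fminf. This is only a sketch: min_per_quoted_rule and
check_special_values are hypothetical names, not part of the proposal, and the
signed-zero check deliberately accepts either zero.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: the quoted NaN rule spelled out directly. */
static float min_per_quoted_rule(float a, float b) {
    if (isnan(a)) return b;  /* one NaN: return the other; both NaN: b is NaN */
    if (isnan(b)) return a;
    return a < b ? a : b;    /* -0.0f < +0.0f is false, so this returns b */
}

/* Self-check of libm's fminf on the special values discussed above. */
static void check_special_values(void) {
    assert(fminf(NAN, 2.0f) == 2.0f);
    assert(fminf(2.0f, NAN) == 2.0f);
    assert(isnan(fminf(NAN, NAN)));
    /* Either zero is acceptable; both compare equal to 0.0f. */
    assert(fminf(-0.0f, 0.0f) == 0.0f);
}
```

Because -0.0f == 0.0f is true, a test written with == cannot tell which zero
was returned, which is the sense in which either result satisfies "returns the
smaller value".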

I just noticed that the ExpandReductions pass uses fcmp+select for
expanding llvm.reduce.fmin/fmax with nonans. But SelectionDAG expands it
using ISD::FMAXNUM and ISD::FMINNUM. I only looked at ExpandReductions and
saw the nonans check there, but didn't realize it was using fcmp+select.

>
>
>
> Cheers,
> Florian