[cfe-dev] RFC: Add New Set of Vector Math Builtins
Craig Topper via cfe-dev
cfe-dev at lists.llvm.org
Tue Sep 28 20:48:12 PDT 2021
On Tue, Sep 28, 2021 at 2:10 AM Florian Hahn <florian_hahn at apple.com> wrote:
> Hi Craig,
> > On Sep 27, 2021, at 23:54, Craig Topper <craig.topper at gmail.com> wrote:
> > Hi Florian,
> > I have a few questions about the reduction builtins.
> Thanks for taking a look!
> > llvm.reduce.fadd is currently defined as ordered unless the reassociate
> > fast math flag is present. Are you proposing to change that to make it
> That’s a good point and I forgot to explicitly call this out! The
> reduction builtin unfortunately cannot express pairwise reductions and the
> reassociate flag would be too permissive. An initial lowering in Clang
> could just generate the pairwise reduction tree directly, but down the road
> I anticipate improving the reduction builtin to allow expressing pairwise
> reductions. This would probably be helpful for parts of the middle-end too
> which at the moment manually emit pairwise reduction trees (e.g. in the
> epilogue of vector loops with reductions).
I didn't think the vectorizers used pairwise reductions. The cost modelling
flag for it was removed in https://reviews.llvm.org/D105484
FWIW, the X86 backend barely uses the pairwise reduction instructions like
haddps. They have a suboptimal implementation on most CPUs that makes them
not good for reducing over a single register.
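
For illustration, here is a minimal C++ sketch of the two evaluation orders
being discussed. The function names are made up for this example, and the
pairing shown (adjacent elements) is only one possible reading of "pairwise";
this message does not pin down the pairing the proposal intends.

```cpp
#include <array>
#include <cstddef>

// Ordered (sequential) reduction: (((start + v0) + v1) + v2) + ...
// This corresponds to the ordered behavior described above for
// llvm.reduce.fadd when the reassociate flag is absent.
template <std::size_t N>
float reduce_ordered(float start, const std::array<float, N> &v) {
  float acc = start;
  for (float x : v)
    acc += x;
  return acc;
}

// Pairwise reduction tree (assumed pairing: adjacent elements).
// For 4 elements this computes (v0 + v1) + (v2 + v3).
template <std::size_t N>
float reduce_pairwise(std::array<float, N> v) {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  for (std::size_t width = N; width > 1; width /= 2)
    for (std::size_t i = 0; i < width / 2; ++i)
      v[i] = v[2 * i] + v[2 * i + 1];
  return v[0];
}
```

Because float addition is not associative, the two orders can disagree: for
the input {1e8f, 1.0f, -1e8f, 1.0f} the ordered form (starting from 0) gives 1
while the pairwise form gives 0. That is why neither the ordered definition
nor an unrestricted reassociate flag describes a fixed pairwise tree.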
> > llvm.reduce.fmin/fmax change behavior based on the nonans fast math
> > flag. And I think they always imply no signed zeros regardless of whether
> > the fast math flag is present. The vectorizers check the fast math flags
> > before creating the intrinsics today. What are the semantics of the
> > proposed builtin?
> I tried to specify NaN handling in the `Special Values` section. At the
> moment it says "If exactly one argument is a NaN, return the other
> argument. If both arguments are NaNs, return a NaN". This should match both
> the NaN handling of llvm.minnum and libm’s fmin(f). Note that in the
> original email, the Special Values section still includes a mention of
> fmax. That reference should be removed.
> The current proposal does not specifically talk about signed zeros, but I
> am not sure we have to. The proposal defines min/max as returning the
> smaller/larger value. Both -0 and +0 are equal, so either can be returned.
> I think this again matches the behavior of libm’s fmin(f) and llvm.minnum,
> although llvm.minnum’s definition calls this out by stating explicitly
> what happens when it is called with equal arguments. Should the
> proposed definitions also spell that out?
I just noticed that the ExpandReductions pass uses fcmp+select for
expanding llvm.reduce.fmin/fmax with nonans. But SelectionDAG expands it
using ISD::FMAXNUM and ISD::FMINNUM. I only looked at ExpandReductions and
saw the nonans check there, but didn't realize it was using fcmp+select.
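
To make the difference concrete, here is a small C++ sketch of the two
expansions for a single min step. The function names are hypothetical, and
the comparison predicate is an assumption (an ordered less-than); I have not
quoted the exact predicate ExpandReductions emits. std::fmin stands in for
the llvm.minnum / ISD::FMINNUM style of NaN handling.

```cpp
#include <cmath>

// fcmp + select style: an ordered comparison is false if either operand is
// NaN, so the select falls through to b. The result then depends on which
// operand holds the NaN. The same applies to signed zeros: +0 < -0 is false,
// so the result depends on operand order.
float min_cmp_select(float a, float b) { return a < b ? a : b; }

// minnum style: if exactly one operand is NaN, return the other operand.
float min_minnum(float a, float b) { return std::fmin(a, b); }
```

With no NaN inputs the two agree up to the sign of zero, which is presumably
why the fcmp+select form is only used under nonans. With a NaN in one operand,
min_cmp_select(1.0f, NAN) returns NaN while min_minnum(1.0f, NAN) returns 1.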