[llvm-dev] [cfe-dev] Do we need intrinsics for floating-point classification functions?

Wed Sep 22 01:35:24 PDT 2021

On Wed, Sep 22, 2021 at 1:57 PM Roman Lebedev <lebedev.ri at gmail.com> wrote:

> On Wed, Sep 22, 2021 at 7:17 AM Serge Pavlov <sepavloff at gmail.com> wrote:
> >
> > The review was done in https://reviews.llvm.org/D104854.
> After which a revert was requested by *multiple* people,
> said feedback was plainly ignored, and someone else had to do it.
> https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy

This patch was reviewed according to the usual procedure. The intrinsic is
also ordinary, we usually do not discuss every intrinsic in such a way. I
don't know why it caused a negative reaction from some people.

> The dismissiveness (one-sidedness?) of replies, especially in
> "[cfe-dev] Should isnan be optimized out in fast-math mode?"
> also don't quite inspire optimism.
> So forgive me if i'm erring on the safe side here.
>

I am sorry if my replies were considered by someone as dismissive.
Attempting to convince the community is not a one-sidedness. I tried to
reply to each new statement until it became clear that the proposal has no
sufficient support.

> > The design was discussed there, in
> https://lists.llvm.org/pipermail/llvm-dev/2021-August/152257.html and in
> this thread. No alternatives were proposed and no disagreement was
> expressed.
> >
> > If somebody has concerns or objections, they can express them here or
> in  https://reviews.llvm.org/D104854 if they are about the implementation.
> I do have concerns.
> While i agree that some solution is needed for StrictFP,
> I don't see consensus (aka, an explicit agreement by a third party
> that the proposed design makes sense to them) on the RFC's,
> and i do object to plainly recommitting any patches given all of this.
>

This RFC is devoted to the discussion of the design of this intrinsic. So
far it contains constructive discussion and does not contain any
objections. The previous discussion in
https://lists.llvm.org/pipermail/llvm-dev/2021-August/152257.html were also
supportive.

I have no intention to restore the patchset silently, this is why I resume
this discussion. Anyone is welcome to share their concerns, objections and
other thoughts here.

Thanks,
--Serge

> > Thanks,
> > --Serge
>
> Roman
>
> > On Wed, Sep 22, 2021 at 3:19 AM Roman Lebedev <lebedev.ri at gmail.com>
> wrote:
> >>
> >> Given just how contentious all this has been so far,
> >> i would strongly suggest for that to go through review.
> >>
> >> Roman
> >>
> >> On Tue, Sep 21, 2021 at 8:39 PM Serge Pavlov via cfe-dev
> >> <cfe-dev at lists.llvm.org> wrote:
> >> >
> >> > Hi all,
> >> >
> >> > If nobody argues, in a couple of days I will put back the
> `llvm.isnan` implementation excluding the changes for fast-mode.
> >> >
> >> > Thanks,
> >> > --Serge
> >> >
> >> >
> >> > On Fri, Sep 3, 2021 at 1:46 PM Serge Pavlov <sepavloff at gmail.com>
> wrote:
> >> >>
> >> >> Thank you for summarizing.
> >> >>
> >> >> I would prefer Option 2, a separate tool for separate tasks. It
> seems to me that a specialized function is easier to implement than a
> universal one. As implementation of llvm.isnan demonstrated, there may be
> various issues in implementing such functions for different targets. It is
> easier to provide optimized versions of small functions which are used more
> frequently than `fpclassify`. Besides, `fpclassify` may be used itself and
> a user may choose arbitrary constants, which complicates code generation.
> It seems that it is more acceptable to tolerate some inefficiency in
> `fpclassify` than in basic classification intrinsics. And yes, we should
> consider `signbit` with them.
> >> >>
> >> >> Another consideration in favor of dedicated intrinsic rather than
> ordinary function. Compiler may do some optimizations when it knows the
> semantics of the function. In https://reviews.llvm.org/D104854 this
> intrinsic was optimized out if its argument was  provided by an operation
> with 'nnan' flag. We can also think about an optimization that determines
> basic blocks guarded by `llvm.isnan` and assign flag 'nnan' in them. It
> could make code faster in many practical cases, and make use of notorious
> -ffast-math less attractive.
> >> >>
> >> >> Thanks,
> >> >> --Serge
> >> >>
> >> >>
> >> >> On Fri, Sep 3, 2021 at 12:22 PM James Y Knight <jyknight at google.com>
> wrote:
> >> >>>
> >> >>> As you say, we don't strictly need a new intrinsic -- we can emit
> the code to do the correct integer-based bit-checking in the frontend -- as
> was done before. Yet, that is not ideal. The rationale to have an intrinsic
> instead of the frontend generating integer-bit-manipulation seem good, IMO.
> >> >>>
> >> >>> So the question that arises: which new intrinsic(s) do we want to
> add? I list 3 options below. I think Option 1 should be discarded from
> consideration, and I'm not sure which of 2 or 3 is best. I'm leaning
> towards 3 -- it seems like it may be simpler -- although I'm not certain.
> >> >>>
> >> >>> Option 1: we could theoretically address the problem with an
> llvm.experimental.constrained.superquiet_fcmp, which would be the same as
> fcmp, except it would not raise an fp exception even for an sNAN. This
> would be a straightforward substitution for fcmp in Clang's existing
> codegen, and it should work.
> >> >>>
> >> >>> However, I don't think this would be a good idea, because it's a
> poor match for hardware. The "superquiet_fcmp" operation doesn't map to CPU
> functionality in any ISA i'm aware of. Generally, the implementation would
> have to be to filter out sNANs prior to using the hardware fp-compare
> instruction. And how you you detect an sNAN without raising an fp
> exceptoin? Fall down to integer ops. But by doing that, you've done work
> almost equivalent to what you needed for the actual classification function
> that you were trying to implement in the first place. Not real useful.
> >> >>>
> >> >>> Option 2: We could add a whole family of classification intrinsics.
> I think that would be:
> >> >>>
> >> >>> llvm.isnan
> >> >>> llvm.issignaling
> >> >>> llvm.isinf
> >> >>> llvm.isfinite
> >> >>> llvm.isnormal
> >> >>> llvm.issubnormal
> >> >>> llvm.iszero
> >> >>> (Note, some of these are missing corresponding __builtin_is* in
> Clang at the moment -- we have no __builtin_issignaling,
> __builtin_issubnormal, or __builtin_iszero. Probably ought to.)
> >> >>>
> >> >>> We don't necessarily need an intrinsic for fpclassify if we have
> the above intrinsics, since it can be built with llvm.isnan, llvm.isinf,
> llvm.iszero, and llvm.isnormal.
> >> >>>
> >> >>> Option 3: Add only an fpclassify intrinsic.
> >> >>>
> >> >>> That is, something like:
> >> >>>    i32 llvm.fpclassify.i32.f32(i32 %if_snan, i32 %if_qnan, i32
> %if_infinite, i32 %if_normal, i32 %if_subnormal, i32 %if_zero, float %value)
> >> >>> which classifies the given value, returning the value of the
> argument corresponding to its categorization. We can say the %if_* args are
> required to be constant integers, if we like, for simplicity of
> implementation.
> >> >>>
> >> >>> Thus, Clang would codegen __builtin_isnan/etc like:
> >> >>>   %isnan = call i1 llvm.fpclassify.i1.f32(i1 1, i1 1, i1 0, i1 0,
> i1 0, i1 0, float %value)
> >> >>> And for fpclassify, we might generate something like:
> >> >>>   %ret = call i32 llvm.fpclassify.i32.f32(i32 0, i32 0, i32 1, i32
> 4, i32 3, i32 2, float %value)
> >> >>>
> >> >>> On most architectures, we'd expand this intrinsic into appropriate
> integer operations (skipping the parts of classification which are
> irrelevant for the given constant arguments), since there's no
> corresponding hardware instructions available for float classification.
> Or, for non-strictfp functions, we could continue to expand into an
> fcmp-based set of tests...although looking at the asm we currently
> generate, the integer versions may well be faster, other than isnan.
> >> >>>
> >> >>> On SystemZ/s390x, this intrinsic would translate almost directly
> into the "test data class" instruction -- so long as the %if_* arguments
> are all 0/1. That's kinda nice. ("Test data class" takes a fp value and a
> bitmask, and returns true if a bit is set in the position corresponding to
> the classification of the fp value.)
> >> >>>
> >> >>>
> >> >>> Separately, we have the signbit operation. I think that's the only
> other operation that needs to be addressed related to this RFC. Currently,
> Clang always generates integer bit-ops for __builtin_signbit in the
> frontend. This is arguably OK as is. Yet, completing the set of IR fp
> classification intrinsics seems like it'd be a good idea. So, we could also
> (along with any of the above options) add:
> >> >>>   i1 llvm.signbit.f32(float %value)
> >> >>>
> >> >>>
> >> >>> On Thu, Sep 2, 2021 at 8:33 AM Serge Pavlov via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >> >>>>
> >> >>>> Hi all,
> >> >>>>
> >> >>>> Some time ago a new intrinsic `llvm.isnan` was introduced, which
> was intended to represent IEEE-754 operation `isNaN` as well as a family of
> C library functions `isnan*`. Then a concern was raised (see
> https://reviews.llvm.org/D104854) that this functionality should be
> removed. Discussion in the subsequent RFC (
> https://lists.llvm.org/pipermail/llvm-dev/2021-August/152257.html) came
> to consensus that such intrinsic is necessary. Nevertheless the patches
> related to the new intrinsic were reverted. I have to restart the
> discussion in hope to convince the community that this intrinsic and other
> classification functions are necessary.
> >> >>>>
> >> >>>> There are two main reasons why this intrinsic is necessary:
> >> >>>> 1. It allows correct implementation of `isnan` if strict floating
> point semantics is in effect,
> >> >>>> 2. It allows preserving the check in -ffast-math compilation.
> >> >>>>
> >> >>>> To facilitate the discussion let's concentrate on the first
> problem.
> >> >>>>
> >> >>>> Previously the frontend intrinsic `__builtin_isnan` was converted
> into `cmp uno` during IR generation in clang codegen. This solution is not
> suitable if FP exceptions are not ignored, because compare instructions
> raise exceptions if its argument is signaling NaN. Both IEEE-754 (5.7.2) an
> C standard  (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf,
> F.3p6) demand that this function does not raise floating point exceptions.
> There was no target-independent IR construct that could represent `isnan`.
> >> >>>>
> >> >>>> This drawback was significant enough and some attempts to
> alleviate it were undertaken. In https://reviews.llvm.org/D95948 `isnan`
> was implemented using integer operations in strictfp functions. It however
> is not suitable for targets where a more efficient way exists, like
> dedicated instruction. Another solution was implemented in
> https://reviews.llvm.org/D96568, where a hook
> `clang::TargetCodeGenInfo::testFPKind` was introduced, which injects target
> specific code into IR. Such a solution makes IR more target-dependent and
> prevents some IR-level optimizations.
> >> >>>>
> >> >>>> To have a solution suitable for all cases, a new intrinsic
> function `llvm.isnan` was introduced (https://reviews.llvm.org/D104854).
> It protects the check from undesirable optimizations and preserves it till
> selector, where it can be lowered in optimal for a particular target way.
> >> >>>>
> >> >>>> Other classification functions also need their own intrinsics. In
> strictfp mode even a check for zero (`iszero`) cannot be made by comparing
> a value against zero, - if the value is signaling NaN, FP exceptions would
> be raised. James Y Knight in the previous discussion (
> https://lists.llvm.org/pipermail/llvm-dev/2021-August/152282.html) listed
> such "non-computational" functions, which should not signal if provided
> with an sNAN argument.
> >> >>>>
> >> >>>> It looks like new intrinsic is the only consistent and in
> target-agnostic way to implement these checks in all environments including
> the case when FP exceptions are not ignored.
> >> >>>>
> >> >>>> Any feedback is welcome.
> >> >>>
> >> >>>
> >> >>>
> >> >
> >> > _______________________________________________
> >> > cfe-dev mailing list
> >> > cfe-dev at lists.llvm.org
> >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210922/476c931a/attachment.html>