[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?
Serge Pavlov via llvm-dev
llvm-dev at lists.llvm.org
Mon Sep 20 10:09:35 PDT 2021
The MSVC documentation says: “Special values (NaN, +infinity, -infinity,
-0.0) may not be propagated or behave strictly according to the IEEE-754
standard”. This exclusion is necessary to allow transformations that are
valid for real numbers only, like `x * 0 -> 0`. NaNs propagate through
arithmetic from input to output: in most operations, if an operand is NaN,
the result is also NaN. `isnan` has nothing to do with NaN propagation; it
only performs a check. The documentation does not provide any justification
for removing `isnan`.
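
To make the distinction concrete, here is a minimal sketch in C. The function names are hypothetical and chosen for illustration only; the self-check assumes a strict (non-fast-math) build with IEEE-754 doubles.

```c
#include <assert.h>
#include <math.h>

/* Arithmetic: under IEEE-754, NaN * 0 is NaN. Folding `x * 0` to `0` is
   therefore valid only if x is assumed to be neither NaN nor infinite. */
double times_zero(double x) {
    return x * 0.0;
}

/* Classification: inspects x and performs no arithmetic on it. */
int is_nan_value(double x) {
    return isnan(x) ? 1 : 0;
}

/* Self-check for strict builds; call it from main(). */
void check_nan_propagation(void) {
    volatile double opaque = NAN;  /* volatile hinders constant folding */
    double x = opaque;
    assert(is_nan_value(x));                 /* the query itself */
    assert(is_nan_value(times_zero(x)));     /* NaN propagated through x * 0 */
    assert(!is_nan_value(times_zero(2.0)));  /* finite input gives zero */
}
```

With -ffinite-math-only, a compiler may rewrite `times_zero` to return 0 and may also fold `is_nan_value` to 0. Only the first rewrite is a transformation of arithmetic.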
> all the compilers document that they are free to optimize as if there were
> no NaNs, and they then do whatever is best for their implementation.
Exactly. Leaving `isnan` in the code makes compiler behavior more
consistent and convenient for users. Clang can go this way too.
> Do you have a concrete reason why a pragma is unsuitable?
I described my concerns in my reply to Mehdi Amini's message.
Thanks,
--Serge
On Mon, Sep 20, 2021 at 11:39 PM Chris Tetreault <ctetreau at quicinc.com>
wrote:
> You’re confusing implementation details (you have a Godbolt link that
> shows that MSVC just happens to not remove the isnan call) with documented
> behavior (I provided a link to the MSVC docs that shows that no promises
> are made with respect to NaN). The fact is that no compiler (Maybe ICC
> does, I don’t know, I haven’t checked. I bet their docs say something
> similar to MSVC, clang, and GCC though.) guarantees that isnan(x) will not
> be optimized out with fast-math enabled. There is no inconsistency: all the
> compilers document that they are free to optimize as if there were no NaNs,
> and they then do whatever is best for their implementation. If you think
> this is inconsistent, then let me tell you about that time I dereferenced a
> null pointer and it didn’t segfault.
>
>
>
> Now, many people have suggested in this thread that a pragma be added. I
> personally fully support this proposal. I think it’s a very clean solution,
> and any non-trivial portable codebase probably already has a library of
> preprocessor macros that abstract this sort of thing. Do you have a
> concrete reason why a pragma is unsuitable?
>
>
>
> *From:* Serge Pavlov <sepavloff at gmail.com>
> *Sent:* Monday, September 20, 2021 1:23 AM
> *To:* Mehdi AMINI <joker.eph at gmail.com>
> *Cc:* Chris Tetreault <ctetreau at quicinc.com>; llvm-dev at lists.llvm.org;
> cfe-dev at lists.llvm.org
> *Subject:* Re: [cfe-dev] [llvm-dev] Should isnan be optimized out in
> fast-math mode?
>
>
>
> On Fri, Sep 17, 2021 at 11:17 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>
> On Thu, Sep 16, 2021 at 11:19 PM Serge Pavlov <sepavloff at gmail.com> wrote:
>
> On Fri, Sep 17, 2021 at 10:53 AM Mehdi AMINI <joker.eph at gmail.com> wrote:
>
> On Thu, Sep 16, 2021 at 8:23 PM Serge Pavlov via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> On Fri, Sep 17, 2021 at 3:11 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
> The difference there is that doing pointer arithmetic on null pointers
> doesn't *usually* work, unless you turn on -ffast-pointers.
>
> It seems to me that most confusion related to -ffast-math is likely
> caused by people who are transitioning to using it. I have some codebase,
> and I turn on fast math, and then a few months down the road I notice a
> strangeness that I did not catch during the initial transition period. If
> you're writing new code with fast-math, you don't do things like try to use
> NaN as a sentinel value in a TU with fast math turned on. This is the sort
> of thing you catch when you try to transition an existing codebase. Forgive
> me for the uncharitable interpretation, but it's much easier to ask the
> compiler to change to accommodate your use case than it is to refactor your
> code.
>
>
>
> It is common to attribute problems with -ffinite-math-only to user
> ignorance. However, user misunderstandings and complaints may indicate a
> flaw in the compiler implementation, which I believe is the case here.
>
>
>
> Using NaN as a sentinel is natural when you cannot afford extra memory
> for a per-item flag, cannot spend extra cycles reading that flag, and do
> not want to pollute the cache. It does not depend on reading documentation
> or writing the code from scratch. It is simply the best solution for
> storing data. If performance of the data processing is critical,
> -ffast-math is a good solution. This is a fairly legitimate use case. The
> fact that the compiler does not allow it is a compiler drawback.
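
The sentinel pattern described above can be sketched as follows. This is an illustration only: `sum_present` and its parameters are hypothetical names, and the self-check assumes a strict (non-fast-math) build.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Sums the samples that are present. NaN entries mark missing data and are
   skipped, so no separate flag array has to be stored or read. */
double sum_present(const double *samples, size_t count, size_t *present) {
    double total = 0.0;
    size_t seen = 0;
    for (size_t i = 0; i < count; ++i) {
        if (isnan(samples[i])) {
            continue;  /* sentinel: this slot holds no measurement */
        }
        total += samples[i];
        ++seen;
    }
    if (present != NULL) {
        *present = seen;
    }
    return total;
}

/* Self-check for strict builds; call it from main(). */
void check_sum_present(void) {
    const double samples[] = {1.0, NAN, 2.5, NAN, 4.0};
    size_t present = 0;
    double total = sum_present(samples, 5, &present);
    assert(present == 3);
    assert(total == 7.5);
}
```

If the `isnan` test is folded to false, as a compiler may do under -ffinite-math-only, the loop adds the NaN entries and the total itself becomes NaN.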
>
>
>
>
> To me, I think Mehdi had the best solution: The algorithm that is the
> bottleneck, and experiences the huge speedup using fast-math should be
> separated into its own source file. This source file, and only this source
> file should be compiled with fast-math. The outer driver loop should not be
> compiled with fast math. This solution is clean, (probably) easy, and
> doesn't require a change in the compiler.
>
>
>
> It is a workaround; it works in some cases but not in others. An ML
> kernel is often a single translation unit, and there may be no such thing
> as a linker for that processor. At the same time, it is computation-intensive
> and using fast-math in it may be very profitable.
>
>
>
> Switching mode in a single TU seems valuable, but could this be handled
> with pragmas or function attributes instead?
>
>
>
> GCC allows it by using `#pragma GCC optimize()`, but clang does not
> support it. No suitable function attribute exists for that.
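
For reference, a rough sketch of how the GCC facility might be used to exempt one function. It assumes GCC's `#pragma GCC push_options` / `optimize` / `pop_options`; the option string "no-finite-math-only" and the helper name `checked_isnan` are assumptions made for illustration, and whether the setting survives inlining into fast-math callers has not been verified here. The pragmas are guarded so that other compilers simply skip them.

```c
#include <assert.h>
#include <math.h>

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("no-finite-math-only")
#endif

/* Hypothetical helper: intended to keep a real NaN test even when the rest
   of the translation unit is compiled with -ffinite-math-only. */
int checked_isnan(double x) {
    return isnan(x) ? 1 : 0;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif

/* Self-check for strict builds; call it from main(). */
void check_checked_isnan(void) {
    volatile double opaque = NAN;  /* volatile hinders constant folding */
    assert(checked_isnan(opaque));
    assert(!checked_isnan(1.0));
}
```

Under clang the guarded pragmas are skipped, so the function is compiled with whatever floating-point options apply to the whole translation unit.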
>
>
>
> Right, I know that clang does not support it, but it could :)
>
> So since we're looking at what provides the best user-experience: isn't
> that it? Shouldn't we look into providing this level of granularity?
> (whether function-level or finer grain)
>
>
>
> It could mitigate the problem if it were implemented. A user who needs to
> handle NaNs in a -ffinite-math-only compilation and writes the code from
> scratch could use this facility to get things working. I also think such a
> pragma, implemented with a sufficient degree of flexibility, could be
> useful irrespective of this topic.
>
>
>
> However, in general it does not solve the problem. The most important
> issue which remains unaddressed is inconsistency of the implementation.
>
>
>
> The handling of `isnan` in -ffinite-math-only by clang is not consistent
> because:
>
> - It differs from what other compilers do. Namely, MSVC and the Intel
> compiler do not throw away `isnan` in this mode: https://godbolt.org/z/qTaz47qhP.
>
> - It depends on optimization options. With -O2 the check is removed, but
> with -O0 it remains: https://godbolt.org/z/cjYePv7s7. Other options can
> also affect the behavior; for example, with `-ffp-model=strict` the check
> is generated irrespective of the optimization mode (see the same link).
>
> - It is inconsistent with libc implementations. If `isnan` is provided by
> libc, it is a real check, but the compiler may drop it.
>
> It would not be an issue if `isnan` removal were just an optimization. It
> however changes semantics in the presence of NaNs, so such removal can
> break user code.
>
>
>
> In the typical use case, a user adds a call to `isnan` to ensure that no
> operations on NaNs occur. The call can also be present in some header that
> implements some functionality for the general case. It may work because
> `isnan` is provided by libc. Later on, when the configuration changes or
> libc is updated, the code may break because the implementation of `isnan`
> changes, as happened after https://reviews.llvm.org/D69806.
>
>
>
> If clang kept calls to `isnan`, it would be consistent with ICC and MSVC
> and with all libc implementations. The behavior would be different from
> gcc, but clang would be on the winning side, because the number of programs
> that work with clang would be larger.
>
>
>
> Also if we agree that NaNs can appear in the code compiled with
> -ffinite-math-only, there must be a way to check if a number is a NaN.
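
One possible shape for such a check, offered as an illustration and not as something proposed in this thread, is a test on the bit pattern. The sketch assumes `double` is IEEE-754 binary64 and that integers and doubles share the same byte order; `is_nan_bits` is a hypothetical name.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if x holds a binary64 NaN bit pattern: all exponent bits set
   and a non-zero mantissa. Uses integer operations only. */
int is_nan_bits(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);  /* well-defined way to read the bytes */
    uint64_t magnitude = bits & UINT64_C(0x7FFFFFFFFFFFFFFF);  /* drop sign */
    return magnitude > UINT64_C(0x7FF0000000000000);  /* above infinity */
}

/* Self-check; call it from main(). */
void check_is_nan_bits(void) {
    assert(is_nan_bits(NAN));
    assert(!is_nan_bits(INFINITY));
    assert(!is_nan_bits(-INFINITY));
    assert(!is_nan_bits(0.0));
    assert(!is_nan_bits(-1.5));
}
```

Because the comparison is done on integers, it does not rely on floating-point semantics. Whether a given compiler leaves it untouched under -ffinite-math-only is not guaranteed by anything cited in this thread.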
>
>
>
> Thanks,
>
> --Serge
>