[llvm-dev] [cfe-dev] Should isnan be optimized out in fast-math mode?

Serge Pavlov via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 10 21:20:35 PDT 2021


On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
wrote:

> The problem is that math code is often templated, so `template <typename
> T>  MyMatrixT<T> safeMul(const MyMatrixT<T> & lhs …` is going to be in a
> header.
>

No problem, the user can write:
```
#ifdef __FAST_MATH__
#undef isnan
#define isnan(x) false
#endif
```
and put it somewhere in the headers.

On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
wrote:

> Regardless, my position isn’t “there is no NaN”. My position is “you
> cannot count on operations on NaN working”.


Exactly. Attempts to express the conditions of -ffast-math as restrictions
on types are not fruitful. I think this is why the GCC documentation
does not use the simple and clear "there is no NaN" but prefers more
complicated wording about arithmetic.

On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
wrote:

> I think working around these sorts of issues is something that C and C++
> developers are used to. Behaviors that are “inconsistent” between
> compilers are something we accept because we know they come with improved
> performance. In this case, the fix is easy, so I don’t think this corner
> case is worth supporting. Especially when the fix is also just one line:
> ```
> #define myIsNan(x) (reinterpret_cast<uint32_t>(x) == THE_BIT_PATTERN_OF_MY_SENTINEL_NAN)
> ```


It won't work this way. `reinterpret_cast<uint32_t>(x)` does not
reinterpret the bits of a float: C++ rejects this cast, and a value
conversion such as `static_cast<uint32_t>(x)` gives 5 when `x == 5.0`.
What you need there is a bitcast. Standard C does not have one. To emulate
it, a reinterpret_cast of memory can be used:
`*reinterpret_cast<int *>(&x)`. Another way is to use a union. Both of
these solutions require memory operations, which is not good for
performance, especially on GPU and ML cores. Of course, a smart compiler
can eliminate the memory operation, but it is not required to, since that
is only an optimization. Moving a value between the float and integer
pipelines may also incur a performance penalty. At the same time, this
check can often be done with a single instruction.
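
To make the difference between a value conversion and a bitcast concrete,
here is a minimal sketch of the bit-level check, assuming `float` is IEEE-754
binary32. It uses `memcpy` as the emulated bitcast; the names `floatBits`,
`isNanBits`, `isSentinelNan` and `floatBitsExample` are hypothetical and not
taken from any existing header.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float is a 32-bit IEEE-754 binary32 value.
static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");

// Emulated bitcast: copy the object representation of x into an integer.
// This is formally a memory operation; a compiler may or may not turn it
// into a plain register move.
inline std::uint32_t floatBits(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// NaN test using integer operations only: the exponent field is all ones
// and the mantissa field is nonzero.
inline bool isNanBits(float x) {
    const std::uint32_t bits = floatBits(x);
    const std::uint32_t exponentMask = 0x7F800000u;
    const std::uint32_t mantissaMask = 0x007FFFFFu;
    return (bits & exponentMask) == exponentMask &&
           (bits & mantissaMask) != 0;
}

// Check against one specific sentinel pattern chosen by the caller.
inline bool isSentinelNan(float x, std::uint32_t sentinelBits) {
    return floatBits(x) == sentinelBits;
}

// Usage example: the bit pattern of 5.0f is not the integer 5.
inline void floatBitsExample() {
    assert(static_cast<std::uint32_t>(5.0f) == 5u);  // value conversion
    assert(floatBits(5.0f) == 0x40A00000u);          // bit pattern
    assert(isNanBits(std::numeric_limits<float>::quiet_NaN()));
    assert(!isNanBits(std::numeric_limits<float>::infinity()));
}
```

The check itself involves no floating-point comparison, but the sketch still
has the costs described above: a copy through memory unless the compiler
removes it, and a move between the float and integer pipelines.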

Thanks,
--Serge