[cfe-dev] [llvm-dev] Should isnan be optimized out in fast-math mode?
Serge Pavlov via cfe-dev
cfe-dev at lists.llvm.org
Sun Sep 12 23:02:20 PDT 2021
I was also wrong about reinterpret_cast, sorry.
`reinterpret_cast<uint32_t>(x)` with `x` of type float is an invalid
construct. The working construct is `reinterpret_cast<uint32_t&>(x)`.
It has the same drawback, however: it requires `x` to be in memory.
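
For reference, here is a minimal sketch of a bit-pattern check written with
`std::memcpy`, which is well-defined C++ and avoids casting a reference. The
helper names `float_bits` and `is_nan_bits` are made up for illustration, and
the code assumes `float` is a 32-bit IEEE-754 value. Compilers typically lower
such a fixed-size copy to a register move, but that is still only an
optimization, so the concern about memory access remains.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the object representation of a float into a
// 32-bit integer. Assumes float is IEEE-754 binary32.
inline std::uint32_t float_bits(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Hypothetical helper: NaN test done purely on the bit pattern, so it does
// not rely on floating-point comparisons that fast-math may fold away.
// A NaN has all exponent bits set and a non-zero mantissa.
inline bool is_nan_bits(float x) {
    const std::uint32_t bits = float_bits(x);
    return (bits & 0x7FFFFFFFu) > 0x7F800000u;
}
```

In C++20, `std::bit_cast<std::uint32_t>(x)` from `<bit>` expresses the same
conversion directly.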
Thanks,
--Serge
On Sat, Sep 11, 2021 at 11:20 AM Serge Pavlov <sepavloff at gmail.com> wrote:
> On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
>> The problem is that math code is often templated, so `template <typename
>> T> MyMatrixT<T> safeMul(const MyMatrixT<T> & lhs …` is going to be in a
>> header.
>>
>
> No problem, the user can write:
> ```
> #ifdef __FAST_MATH__
> #undef isnan
> #define isnan(x) false
> #endif
> ```
> and put it somewhere in the headers.
>
> On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
>> Regardless, my position isn’t “there is no NaN”. My position is “you
>> cannot count on operations on NaN working”.
>
>
> Exactly. Attempts to express the condition of -ffast-math as restrictions
> on types are not fruitful. I think that is why the GCC documentation
> does not use the simple and clear "there is no NaN" but prefers more
> complicated wording about arithmetic.
>
> On Sat, Sep 11, 2021 at 2:39 AM Chris Tetreault <ctetreau at quicinc.com>
> wrote:
>
>> I think working around these sorts of issues is something that C and C++
>> developers are used to. Behaviors that are “inconsistent” between
>> compilers are something we accept because we know they come with improved
>> performance. In this case, the fix is easy, so I don’t think this corner
>> case is worth supporting. Especially when the fix is also just one line:
>> ```
>> #define myIsNan(x) (reinterpret_cast<uint32_t>(x) ==
>> THE_BIT_PATTERN_OF_MY_SENTINEL_NAN)
>> ```
>
>
> It won't work this way. If `x == 5.0`, then
> `reinterpret_cast<uint32_t>(x) == 5`. What you need there is a bitcast.
> Standard C does not have one. To emulate it, a reinterpret_cast of memory
> can be used: `*reinterpret_cast<int *>(&x)`. Another way is to use a
> union. Both solutions require memory operations, which is not
> good for performance, especially on GPU and ML cores. Of course, a smart
> compiler can eliminate the memory operation, but it is not required to,
> as that is only an optimization. Moving a value between the float and
> integer pipelines may also incur a performance penalty. At the same time,
> this check can often be done with a single instruction.
>
> Thanks,
> --Serge
>