<div dir="ltr"><div dir="ltr">On Mon, Sep 20, 2021 at 12:40 PM Chris Tetreault via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">


<div lang="EN-US" style="overflow-wrap: break-word;">

<div class="gmail-m_-1752228868123682171WordSection1">

<p class="MsoNormal">You’re confusing implementation details (you have a Godbolt link that shows that MSVC just happens to not remove the isnan call) with documented behavior (I provided a link to the MSVC docs that shows that no promises are made with respect

 to NaN). The fact is that no compiler (Maybe ICC does, I don’t know, I haven’t checked. I bet their docs say something similar to MSVC, clang, and GCC though.) guarantees that isnan(x) will not be optimized out with fast-math enabled. There is no inconsistency:

 all the compilers document that they are free to optimize as if there were no NaNs, and they then do whatever is best for their implementation. If you think this is inconsistent, then let me tell you about that time I dereferenced a null pointer and it didn’t

 segfault.</p></div></div></blockquote><div><br></div><div>+1.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="gmail-m_-1752228868123682171WordSection1"><p class="MsoNormal"><u></u><u></u></p>

<p class="MsoNormal"><u></u>Now, many people have suggested in this thread that a pragma be added. I personally fully support this proposal. I think it’s a very clean solution, and any non-trivial portable codebase probably already has a library of preprocessor macros

 that abstract this sort of thing. Do you have a concrete reason why a pragma is unsuitable?</p></div></div></blockquote><div><br></div><div>I think that there are two questions in this thread.</div><div>- How should fast-math mode actually behave? [Maybe we're settled on the "NaNs are signaling NaNs and signaling operations produce unspecified values" model. Gee, I hope so.]</div><div>- Should switching into/out-of fast-math mode be controlled only by a TU-level command-line option, or should there also be a pragma for it?</div><div>(Btw, multiply these questions by the number of different modes we support; I've consciously been trying to phrase everything in terms of NaNs, but Serge likes to talk about -ffinite-math-only, where not just NaNs but also INF and -INF are verboten. And then there's the <a href="https://gcc.gnu.org/wiki/FloatingPointMath">-fno-signed-zeros option</a>, which <i>does not forbid</i> -0.0, but does permit it to be treated as a-zero-value-of-unspecified-sign. I think -ffast-math probably also forbids subnormals... but maybe it just treats them as either-their-actual-value-or-zero-of-the-appropriate-sign.)</div><div><br></div><div>Anyway, should there be a pragma in addition to the TU-level command-line option?</div><div><br></div><div>There must be a command-line option anyway; I mean, it already exists (-ffast-math, etc.). Pragmas are basically <i>about</i> taking some command-line decision and allowing the decision to be made more granularly. Look at `#pragma GCC diagnostic ignored "-Wfoo"`, for example; it's expressed in terms of the command-line option. So if Clang were to support something like</div><div>    #pragma GCC optimize("ffast-math")  // cf. #pragma GCC optimize("O2")</div><div>that would still be expressed in terms of the command-line option, and hopefully both the option and the pragma would end up setting the same internal bits.</div><div><br></div><div>However, pragmas are hard to get right. 
Consider:</div><div><br></div><div>    bool unoptimized(double x) { return (x + 1) > x; }</div><div>    #pragma GCC optimize("ffast-math")</div><div>    bool optimized(double x) { return unoptimized(x+1); }</div><div><div>    #pragma GCC optimize("fno-fast-math")</div></div><div><div>    int main() {</div><div>        return optimized(HUGE_VAL);</div><div>    }</div><div><br></div><div>The compiler would have to think about what it means to inline `unoptimized` into `optimized`.  The arithmetic in `optimized` produces INF, but then it's passed to `unoptimized`, which is not marked as fast-math, so I guess the compiler can't optimize `(x+1) > x` into `true` in that context?  It's <i>at least</i> subtle for the compiler vendor to get right, and possibly philosophically confusing as well.</div><div>Alternatively, you could forbid inlining between functions with different optimization levels... but that's <i>clearly</i> a terrible idea, right?</div><div><br></div><div>And of course some programmer is going to try something dumb like</div><div><br></div><div>    #pragma GCC optimize("ffast-math")<br></div><div>    #define REAL_ISNAN(x) std::isnan(x)</div><div><div><div>    #pragma GCC optimize("fno-fast-math")</div></div><div><br class="gmail-Apple-interchange-newline"></div></div></div><div>which "of course" won't work, but who's going to explain it to them?</div><div><br></div><div>Not to mention, if the pragma is active at the top of the TU where some template or implicitly defaulted special member is defined, but then it's not active at the point where the template is instantiated or the special member is implicitly defined... what the heck happens in <i>that</i> case? And who's going to write the StackOverflow answer about it?</div><div><br></div><div>Basically, the translation unit is the <i>natural</i> unit of... hmm... translation. 
There's very little return-on-investment involved in trying to circumvent that.</div><div><br></div><div>–Arthur</div></div></div>