<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><font face="-apple-system-font" class=""><span class="" style="line-height: 16px;">After building with clang 3.7svn recently, I saw a huge speed hit across much of our HPC and floating-point DSP code. I looked at the asm output and it's riddled with calls to ___mulsc3, which is never inlined (preventing lots of other optimizations) and which includes a bunch of </span></font><span class="" style="font-family: -apple-system-font; line-height: 16px;">C99 Annex G-recommended</span><span class="" style="line-height: 16px; font-family: -apple-system-font;"> branch conditions for NaN and infinity checks and whatnot. One of the purposes of -ffast-math has always been to disable this sort of check, trusting the developer to ensure that those cases can't happen or will be handled upstream.</span></div><div class=""><span class="" style="line-height: 16px; font-family: -apple-system-font;"><br class=""></span></div><div class=""><font face="-apple-system-font" class=""><span class="" style="line-height: 16px;">Explicitly writing out the real and imaginary component math in one of my critical sections was enough to confirm that the problem lies here and not elsewhere. 
However, doing this throughout our code would be prohibitive, and it would of course greatly reduce the readability of the code and presumably the ability of future compilers to optimize it in ways that I haven’t thought of yet.</span></font></div><div class=""><font face="-apple-system-font" class=""><span class="" style="line-height: 16px;"><br class=""></span></font></div><div class=""><font face="-apple-system-font" class=""><span class="" style="line-height: 16px;">The relevant patch discussion on the mailing list is here: <a href="http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20141006/116248.html" class="">http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20141006/116248.html</a>. It includes a comment from hfinkel also requesting that the libcalls be skipped in fast-math mode. From what I can see, there was no follow-up on this.</span></font></div><div class=""><font face="-apple-system-font" class=""><span class="" style="line-height: 16px;"><br class=""></span></font></div><div class=""><span class="" style="font-family: -apple-system-font; line-height: 16px;">At the bare minimum, I think these checks should be disabled within ___mulsc3 when -ffast-math or the relevant subflag is enabled. Preferably, the library calls should be skipped entirely, as before, so that other compiler optimizations aren't prevented.</span></div></body></html>