[cfe-dev] Complex arithmetic ignores -ffast-math after clang r219557, serious performance regressions

Thu Jun 25 11:54:10 PDT 2015

After building with clang 3.7svn recently, I saw a huge speed hit across much of our HPC and floating-point DSP code. I looked at the asm output and it's riddled with calls to ___mulsc3, which is never inlined (preventing lots of other optimizations) and which includes a bunch of branches, recommended by C99 Annex G, for range checks (NaN and infinity handling) and whatnot. One of the purposes of -ffast-math has always been to disable these sorts of checks, trusting the developer to ensure that the cases they guard against can't happen or will be handled upstream.

Explicitly writing out the real and imaginary component math in one of my critical sections was enough to confirm that the problem lies here and not elsewhere. However, doing this throughout all of our code would be prohibitive, and of course it greatly reduces the readability of the code and presumably the ability of future compilers to optimize it in ways that I haven't thought of yet.
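
For illustration, here is a minimal sketch of the kind of hand expansion meant above. It is not taken from the actual code base; all function names (cf_make, cf_re, cf_im, mul_operator, mul_expanded) are hypothetical. The operator form is what may be lowered to a __mulsc3 call on the affected builds, while the expanded form is the plain textbook product with no NaN/infinity recovery.

```c
#include <complex.h>
#include <string.h>

/* Hypothetical helpers. A complex type has the same layout as a
   two-element array {real, imag}, so memcpy moves the parts in and
   out without doing any complex arithmetic. */
float complex cf_make(float re, float im)
{
    float parts[2] = { re, im };
    float complex z;
    memcpy(&z, parts, sizeof z);
    return z;
}

float cf_re(float complex z)
{
    float parts[2];
    memcpy(parts, &z, sizeof parts);
    return parts[0];
}

float cf_im(float complex z)
{
    float parts[2];
    memcpy(parts, &z, sizeof parts);
    return parts[1];
}

/* Operator form: readable, but the compiler may emit a library call
   (__mulsc3) that carries the Annex G recovery branches. */
float complex mul_operator(float complex a, float complex b)
{
    return a * b;
}

/* Hand-expanded form: (ar + i*ai) * (br + i*bi), written out as
   four real multiplies, one subtract and one add. No special-case
   handling for NaN or infinity inputs. */
float complex mul_expanded(float complex a, float complex b)
{
    float ar = cf_re(a), ai = cf_im(a);
    float br = cf_re(b), bi = cf_im(b);
    return cf_make(ar * br - ai * bi, ar * bi + ai * br);
}
```

For finite inputs the two forms should agree. Whether the operator form actually becomes a libcall can be checked by compiling with something like clang -O2 -ffast-math -S and searching the output for mulsc3.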

The relevant patch discussion on the mailing list is here: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20141006/116248.html and includes a comment from hfinkel also requesting that the libcalls be skipped in fast-math mode. From what I can see, there was no follow-up on this.

At the bare minimum, I think these checks should be disabled within mulsc3 when -ffast-math or the relevant subflag is enabled. Preferably, the library calls would be skipped entirely, as before, so that other compiler optimizations aren't prevented.