[cfe-dev] CLang and ISO C math functions

Martin J. O'Riordan via cfe-dev cfe-dev at lists.llvm.org
Wed Aug 31 03:10:28 PDT 2016


When I updated our out-of-tree compiler from v3.8 to v3.9 RC3, I noticed a
number of large performance regressions, with some tests using 5 times as
many instructions.  However, when I examined the tests, I realised that
something quite different was going on concerning the ISO C math functions,
and that it is not a true performance regression at all.

 

I will use the following example for this message, compiled with
'-S -O3'.  The option '-ffast-math' is not used, and I have verified that
'-fmath-errno' is present in the '-cc1' options:

 

extern double exp(double);
extern double foo(double);

int useMathName() {
  if ((exp(1.0) < 2.71) || (exp(1.0) > 2.72))
    return -1;
  return 0;
}

int useOtherName() {
  if ((foo(1.0) < 2.71) || (foo(1.0) > 2.72))
    return -1;
  return 0;
}

 

With v3.8, the implementation of the function 'useMathName' was reduced to
simply 'return 0'.  The compiler elided the calls to 'exp', assumed the
value that would have been returned had it been called, decided that the two
tests would be 'false' and reduced the code generation to 'return 0'.  This
does not happen for the other function 'useOtherName', for which the
generated code is as expected.

 

After updating to v3.9 RC3, the compiler no longer elides the calls to
'exp' (probably because a bug was fixed, since 'errno' could be changed),
but it still presumes the returned value and elides the tests, so the
function is now two consecutive calls to 'exp' followed by 'return 0'.
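
As a rough sketch in C of the two behaviours described above: the function
names below are hypothetical and exist only for illustration, and this is a
source-level paraphrase of the observed output, not actual compiler output.

```c
#include <math.h>

/* Roughly what v3.8 produced for 'useMathName':
   both calls and both comparisons are gone. */
int useMathName_v38_effective(void) {
  return 0;
}

/* Roughly what v3.9 RC3 produces: the two calls remain (they may set
   'errno'), but their results are ignored because the comparisons have
   already been decided at compile time. */
int useMathName_v39_effective(void) {
  (void)exp(1.0);
  (void)exp(1.0);
  return 0;
}
```

In both cases the value actually returned by the target's 'exp' never
influences the result of the function.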

 

I have verified that this is the case for the unaltered X86 v3.8
distribution version too.

 

I would expect this behaviour if '-ffast-math -fno-math-errno' were selected,
but it isn't, and I think that this is an invalid optimisation.  It also
means that some of my math functional tests are not reporting honestly (this
only happens when the arguments are constants).  Also, on our
architecture, 'double' is FP32, and the compiler is probably evaluating the
test expressions with the host platform's implementation, which is FP64.
This will introduce precision differences that the tests will not detect; in
my real tests, the test expression ranges are more fine-grained to allow for
legitimate FP32 ranges.

 

Thanks,

 

            MartinO

 
