[cfe-dev] Clang not generating pow_finite with -ffast-math
Hal Finkel
hfinkel at anl.gov
Fri Sep 13 14:00:53 PDT 2013
Sriram,
I don't know why we do the pow intrinsic replacement; the semantics of the intrinsic are defined to be identical to the libm function call. It might just be something that has been that way for a long time, and now we should do something else.
This also brings up another issue on the LLVM side of things: should we default the library-function intrinsic expansions to the _finite versions in fast-math mode?
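To make the first point in the quoted report concrete, here is a minimal sketch, not compiler output, that puts the reported function next to a hand-folded version. The names foo_folded and relative_difference are hypothetical helpers introduced only for this illustration. The assert spells out the finite-only assumption that fast-math mode makes implicitly.

```cpp
#include <cassert>
#include <cmath>

// The function from the quoted report, written out literally.
double foo(double val, double i) {
    double t = std::log(std::exp(val));
    return std::pow(t, i);
}

// Hypothetical hand-folded form: what the function reduces to if
// log(exp(val)) is replaced by val. This is an illustration only.
double foo_folded(double val, double i) {
    return std::pow(val, i);
}

// Relative difference between two results. The assert states the
// finite-only assumption explicitly: both inputs must be finite.
double relative_difference(double a, double b) {
    assert(std::isfinite(a) && std::isfinite(b));
    double scale = std::fmax(std::fabs(a), std::fabs(b));
    return scale == 0.0 ? 0.0 : std::fabs(a - b) / scale;
}
```

For moderate inputs such as foo(2.0, 3.0) the two forms agree to within rounding. They diverge once exp(val) overflows: with val = 1000.0 the literal form returns infinity while the folded form stays finite, which is why the fold is only legal under relaxed floating-point rules.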
-Hal
----- Original Message -----
>
> Hi,
>
> I am trying to make clang generate code similar to gcc for the
> following function with -ffast-math option.
>
> #include <cmath>
>
> double foo(double val, double i) {
>   double t = log(exp(val));
>   return pow(t, i);
> }
>
>
>
> There are a few issues here.
>
> · Firstly, clang is not optimizing log(exp(val)) away into val.
>
> · Secondly, clang is choosing an intrinsic for pow instead of a
> library call to @__pow_finite:
>
> %0 = tail call double @llvm.pow.f64(double %call1, double %i)
>
>
>
> I am wondering if there is a performance benefit in using
> @llvm.pow.f64, because at the end of the day, llvm generates code
> for x86 as follows:
>
> .cfi_def_cfa_offset 16
> movsd %xmm1, (%rsp) # 8-byte Spill
> callq __exp_finite
> callq __log_finite
> movsd (%rsp), %xmm1 # 8-byte Reload
> popq %rax
> jmp pow
>
>
>
> I believe __pow_finite is faster than pow. Please correct me if I am
> wrong.
>
>
>
> Thanks
>
> Ram
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory