[cfe-dev] Clang not generating pow_finite with -ffast-math

Hal Finkel hfinkel at anl.gov
Fri Sep 13 14:00:53 PDT 2013


Sriram,

I don't know why we do the pow intrinsic replacement; the semantics of the intrinsic are defined to be identical to the libm function call. It might just be something that has been that way for a long time, and now we should do something else.

This also brings up another issue on the LLVM side of things: should we default the library-function intrinsic expansions to the _finite versions in fast-math mode?
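
For reference, here is a minimal sketch of the log(exp(val)) simplification discussed in the quoted message below. The names foo_simplified and relative_difference are hypothetical and used only for illustration; this is not code from clang or LLVM. It only shows that the two forms agree for moderate inputs and diverge once exp overflows, which is why the rewrite is only acceptable under fast-math-style assumptions.

```cpp
#include <cmath>

// The function from the original report, written with explicit std:: calls.
double foo(double val, double i) {
    double t = std::log(std::exp(val));
    return std::pow(t, i);
}

// Hypothetical name: the form a fast-math optimizer could reduce foo to,
// assuming log(exp(val)) is folded to val.
double foo_simplified(double val, double i) {
    return std::pow(val, i);
}

// Hypothetical helper: relative difference between two results.
double relative_difference(double a, double b) {
    double scale = std::fmax(std::fabs(a), std::fabs(b));
    return scale == 0.0 ? 0.0 : std::fabs(a - b) / scale;
}
```

For moderate arguments such as val = 2.0 the two functions agree to within rounding error. For val = 800.0, exp(val) overflows to infinity in double precision, so foo returns infinity while foo_simplified returns a finite value; without fast-math the compiler has to preserve that difference.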

 -Hal

----- Original Message -----
> 
> Hi,
> 
> I am trying to make clang generate code similar to gcc for the
> following function with the -ffast-math option.
> 
> #include <cmath>
> 
> double foo(double val, double i) {
>   double t = log(exp(val));
>   return pow(t, i);
> }
> 
> 
> 
> There are a few issues here.
> 
> · Firstly, clang is not optimizing log(exp(val)) away into val.
> 
> · Secondly, clang is choosing an intrinsic for pow instead
> of a library call to @__pow_finite:
> 
> %0 = tail call double @llvm.pow.f64(double %call1, double %i)
> 
> 
> 
> I am wondering if there is a performance benefit in using
> @llvm.pow.f64, because at the end of the day, llvm generates code
> for x86 as follows:
> 
> .cfi_def_cfa_offset 16
> movsd %xmm1, (%rsp) # 8-byte Spill
> callq __exp_finite
> callq __log_finite
> movsd (%rsp), %xmm1 # 8-byte Reload
> popq %rax
> jmp pow
> 
> 
> 
> I believe __pow_finite is faster than pow. Please correct me if I am
> wrong.
> 
> 
> 
> Thanks
> 
> Ram
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



