[cfe-dev] [LLVMdev] runtime performance benchmarking tools for clang
jyoti.yalamanchili at gmail.com
Wed Dec 11 23:12:06 PST 2013
Thanks for your reply.
We enabled -ffast-math, which in turn adds -fno-math-errno to clang -cc1.
As a result the call to the sqrt function was replaced with a VSQRT
instruction, and we saw an improvement of ~40% for some of the test cases.
A lag still exists when compared to gcc, though. We are currently
investigating that. Any pointers in this direction would help.
Could you suggest some benchmarks specifically for floating point?
On Wed, Dec 11, 2013 at 11:28 PM, David Peixotto <dpeixott at codeaurora.org> wrote:
> 2) For the lag in execution time due to floating point operations, it was
> clearly observed that gcc used the floating point instruction FSQRT, whereas
> clang seemed to use an emulated function (?) BL SQRT.
> Note that we used the following flags for both clang and gcc:
> -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -mtune=cortex-a8
> In fact, I was surprised to see that even when
> "-march=armv7-a -mfloat-abi=*hard* -mfpu=vfpv3-d16 -mtune=cortex-a8"
> was used, the generated code did not use the hardware *vsqrt* instruction;
> instead there was a *bl sqrt* instruction.
> Could someone point out why *vsqrt* was not emitted in the assembly even
> though the softfp or 'hard' float-abi was specified?
> The vsqrt instruction may not be generated automatically for
> platforms where math functions may set errno. Try compiling with
> -fno-math-errno and see if that helps.