[LLVMdev] runtime performance benchmarking tools for clang

Wed Dec 11 09:58:10 PST 2013

2) For the lag in execution time due to floating-point operations, it was
clearly observed that gcc used the floating-point instruction FSQRT, whereas
clang seemed to use an emulated function (?) BL SQRT.

Note that we used the following flags for both the clang and the gcc
compilation.

-march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -mtune=cortex-a8

In fact, I was surprised to see that even when "-march=armv7-a
-mfloat-abi=hard -mfpu=vfpv3-d16 -mtune=cortex-a8"

was used, the generated code did not use the hardware vsqrt instruction;
instead there was a bl sqrt instruction.

Could someone point out why vsqrt was not emitted in the assembly even though
the softfp or 'hard' float ABI was specified?

The vsqrt instruction may not be generated automatically on platforms
where math functions may set errno. Try compiling with -fno-math-errno and
see if that helps.
