[llvm-commits] Patch: Math Lib call optimization

Wed Nov 2 11:07:13 PDT 2011

Hello,

I have written an LLVM patch that optimizes calls to mathematical library
functions and would like to submit it for your review.

A. Background:
In C89, most of the mathematical library functions accept only the type
"double" for their floating-point arguments.
C99 lifted this limitation by introducing a new set of functions with an "f"
suffix that accept "float" arguments. Our experiments show that, on the ARM
platform, the "float" versions are significantly faster than their
double-precision counterparts. For example, "float sinf(float)" is 1.87 times
faster than "double sin(double)".

However, this new set of functions is not always exploited by programmers.
For example, a programmer may write:
void foo(float y) {
  float x = sin(y);
  ...
}
instead of writing:
void foo(float y) {
  float x = sinf(y);
  ...
}

B. This optimization:
This optimization looks for missed opportunities in which a lighter-weight
function could be used without losing precision.
To keep the transformation legal, the conversion is performed only if:
1) the arguments of the call are all defined by FP extension instructions.
For example, in the first function, for "sin(y)", "y" is implicitly extended
from "float" to "double".

2) the return value of the call is used only by an FP truncation instruction.
For example, in the first function, the return value of "sin(y)" is used only
by a truncation from "double" to "float", whose result is then assigned to
variable "x".

Hence, this transformation will not result in precision loss.
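
The two conditions can be sketched on a toy model as follows. This is only an
illustration under stated assumptions: the types "struct value" and
"struct call" and the function can_use_float_version are hypothetical stand-ins,
not LLVM classes and not the code in the patch. The sketch also assumes that
"only used by" means exactly one use, and it ignores the check that the
extension is specifically from "float" to "double".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction kinds; not LLVM's real classes. */
enum kind { K_OTHER, K_FPEXT, K_FPTRUNC };

struct value {
    enum kind kind;
};

/* A call site in the toy model: its arguments and the users of its result. */
struct call {
    const struct value *const *args;
    size_t num_args;
    const struct value *const *users;
    size_t num_users;
};

static bool can_use_float_version(const struct call *c) {
    /* Condition 1: every argument is defined by an FP extension. */
    for (size_t i = 0; i < c->num_args; ++i)
        if (c->args[i]->kind != K_FPEXT)
            return false;

    /* Condition 2 (assumed to mean a single use): the result is used
       only by an FP truncation. */
    return c->num_users == 1 && c->users[0]->kind == K_FPTRUNC;
}
```

For "float x = sin(y)" the single argument is an extension and the single user
is a truncation, so the check succeeds; if the double result were also stored
or passed elsewhere, it would fail and the call would be left alone.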

C. Patch details (libcall_fp_version_opt.diff):
1) lib/Transforms/Scalar/SimplifyLibCalls.cpp: we added a new pass, called
"ReplaceLibCallVersions", for this optimization. This pass inherits from the
existing SimplifyLibCalls pass, and the concrete optimizing class inherits
from the existing LibCallOptimization class. We did not merge our code into
the existing SimplifyLibCalls pass because both passes optimize some of the
same library calls (e.g. pow()).

2) lib/Target/ARM/ARMTargetMachine.cpp: so far, we have tested and verified
this pass only on ARM, so we invoke it only for ARM, in the same way the
Global-Merge pass is invoked. Please let us know if you would like it to be
made available for all architectures.

3) test/CodeGen/ARM/libcallconv.ll: test cases for this optimization 

D. Failure report
The attached failures.txt lists the failures I observed when running
llvm/test (svn rev 143352) and projects/test-suite (svn rev 142659) on ARM.

Thank you,
Weiming Zhao

-------------- next part --------------
A non-text attachment was scrubbed...
Name: libcall_fp_version_opt.diff
Type: application/octet-stream
Size: 9623 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111102/fac08354/attachment.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: failures.txt
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111102/fac08354/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: libcallconv.ll
Type: application/octet-stream
Size: 946 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111102/fac08354/attachment-0001.obj>