[llvm-commits] PATCH: Followup to r152556: Use more powerful instsimplify when actually performing the inline of a function
Chandler Carruth
chandlerc at gmail.com
Mon Mar 19 19:41:17 PDT 2012
Chris had great feedback on my patch in r152556: it isn't very general in
its approach. Why don't we fold these constants during inlining
already?
The inliner *does* use the constant folder, but that constant folder can't
catch a lot of the cases we care about, so it gives up and leaves the
results unsimplified even though simplification is possible.
My original patch papers over this by running the more powerful
instsimplify analysis over *just* the callsite arguments before subsequent
runs of inlining. This helps those cases but leaves out some other
important cases:
0) my original issue of constant folding pointer-differences with
constant-related pointers
1) the cases Chris mentioned where there is more complicated math involved
2) cases where we can delete *entire blocks* of code due to the
simplifications
3) cases where we can avoid the expensive cloning step to produce values
identical to values that already exist: (gep x, 0) or (or x 0) for example
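To make the gap between the two routines concrete, here is a minimal toy
sketch. It is not LLVM code: Operand, Op, constantFold and simplify are
hypothetical names invented for illustration. It only shows why a folder
that requires all-constant operands gives up on (or x, 0), while a
simplifier that may return an *existing* value does not.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Toy operand: either a known integer constant or a named, opaque value.
// Hypothetical type for illustration; this is not LLVM's Value hierarchy.
struct Operand {
    std::optional<long> constant;  // set when the operand is a constant
    std::string name;              // used when it is not
};

enum class Op { Add, Or };

// Constant folding: succeeds only when *both* operands are constants.
std::optional<long> constantFold(Op op, const Operand &a, const Operand &b) {
    if (!a.constant || !b.constant) return std::nullopt;
    switch (op) {
    case Op::Add: return *a.constant + *b.constant;
    case Op::Or:  return *a.constant | *b.constant;
    }
    assert(false && "unhandled opcode");
    return std::nullopt;
}

// Simplification: tries folding first, then identities whose result is an
// operand that already exists (x + 0 == x, x | 0 == x).
std::optional<Operand> simplify(Op op, const Operand &a, const Operand &b) {
    if (auto c = constantFold(op, a, b)) return Operand{*c, ""};
    if (b.constant && *b.constant == 0) return a;
    if (a.constant && *a.constant == 0) return b;
    return std::nullopt;
}
```

With a non-constant x, constantFold(Op::Or, x, zero) returns nothing, while
simplify(Op::Or, x, zero) hands back x itself, so nothing new has to be
created.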
Here is an attempt to fix this by using the more powerful instruction
simplification routine *inside* the inliner. This hooks directly into its
existing value-map-based override system to minimize the cloning, propagate
both constants and simplified values directly when inlining, etc. It should
also more aggressively prune the set of basic blocks cloned during
inlining. The gross part is that I had to extend the interfaces into
SimplifyInstruction, but I think the result is a reasonable compromise and
exposes more information to the simplification pass. Also, I had to enhance
it to survive cases where the instructions in question are not part of
well-formed basic blocks or functions.
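The general shape of the idea can be sketched with a toy model. Everything
here (Val, Inst, Block, ValueMap, simplifyBinary, cloneWithSimplify) is a
hypothetical stand-in, not the patch and not LLVM's API. It assumes a tiny
IR with two opcodes, no PHI nodes, and definitions that are always visited
before their uses. Operands are remapped through the value map, a
simplified result is recorded in the map instead of being cloned, and a
branch on a mapped constant only enqueues the taken successor.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy IR; none of these types are LLVM's.
struct Val {
    std::optional<long> constant;  // set for integer constants
    std::string name;              // otherwise, the value's name
};
struct Inst {
    std::string result;  // name this instruction defines
    char op;             // '+' or '|'
    Val lhs, rhs;
};
struct Block {
    std::vector<Inst> insts;
    std::optional<Val> cond;  // empty: the block returns
    int ifTrue, ifFalse;      // successor indices, used when cond is set
};
using ValueMap = std::map<std::string, Val>;

// Translate a callee operand into the caller's terms via the value map.
Val remap(const Val &v, const ValueMap &vm) {
    if (v.constant) return v;
    auto it = vm.find(v.name);
    assert(it != vm.end() && "operand used before it was mapped");
    return it->second;
}

// Fold constants and apply the identities x + 0 == x and x | 0 == x.
std::optional<Val> simplifyBinary(char op, const Val &a, const Val &b) {
    if (a.constant && b.constant)
        return Val{op == '+' ? *a.constant + *b.constant
                             : (*a.constant | *b.constant), ""};
    if (b.constant && *b.constant == 0) return a;
    if (a.constant && *a.constant == 0) return b;
    return std::nullopt;
}

struct CloneResult {
    std::vector<Inst> cloned;  // instructions that really had to be copied
    std::set<int> visited;     // callee blocks that were reached
};

// vm starts out mapping callee arguments to the call site's actual values.
CloneResult cloneWithSimplify(const std::vector<Block> &callee, ValueMap vm) {
    CloneResult out;
    std::vector<int> worklist{0};
    while (!worklist.empty()) {
        int idx = worklist.back();
        worklist.pop_back();
        if (!out.visited.insert(idx).second) continue;  // already handled
        const Block &bb = callee[idx];
        for (const Inst &i : bb.insts) {
            Val a = remap(i.lhs, vm), b = remap(i.rhs, vm);
            if (auto s = simplifyBinary(i.op, a, b)) {
                vm[i.result] = *s;  // reuse the simplified value, clone nothing
            } else {
                std::string name = "inl." + i.result;
                out.cloned.push_back({name, i.op, a, b});
                vm[i.result] = Val{std::nullopt, name};
            }
        }
        if (!bb.cond) continue;  // return block: no successors
        Val c = remap(*bb.cond, vm);
        if (c.constant) {
            // Known condition: only the taken successor is ever cloned.
            worklist.push_back(*c.constant != 0 ? bb.ifTrue : bb.ifFalse);
        } else {
            worklist.push_back(bb.ifFalse);
            worklist.push_back(bb.ifTrue);
        }
    }
    return out;
}
```

In this toy, a callee whose entry block computes t = a | 0 and branches on
c = n + 0 clones neither instruction when n is mapped to the constant 0,
and the true-side block is never visited at all. A real inliner also has
to deal with PHI nodes, terminators, and instructions that do not yet live
in a well-formed function, none of which is modeled here.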
So far, it survives the regression tests (including those for the original
patch I submitted) and a bootstrap but I still need to write up and add
regression tests for 1, 2, and 3 above (if I can). The resulting binary is
marginally smaller (<1%) and I see no significant performance changes in my
initial testing. I still need to test the optimizer's performance to make
sure we don't inflate the inliner's cost significantly, but it looks
promising thus far.
Comments? Is this the right approach?