[llvm-commits] [llvm] r153812 - in /llvm/trunk: include/llvm/Analysis/ include/llvm/Transforms/IPO/ lib/Analysis/ lib/Transforms/IPO/ test/Transforms/Inline/

Chad Rosier mcrosier at apple.com
Wed Apr 11 09:40:56 PDT 2012


On Apr 11, 2012, at 9:09 AM, Chandler Carruth wrote:

> On Wed, Apr 11, 2012 at 1:56 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
> On Wed, Apr 11, 2012 at 12:31 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
> I'll start looking for smoking guns right away though.
> 
> This looks very much like the previous cases where inliner changes caused compile-time regressions.
> 
> Looking at x86-64 of sqlite3, the profile with the trunk clang shows only 3.5% of all the time in the inline cost analysis. That's a bit higher than I would like (I've got some ideas to shrink it on two fronts that I will implement right away), but it's not likely responsible for the near-10% regression you're seeing; this function wasn't even free before.
> 
> However, I'm seeing time spread pretty well between: JumpThreading, the register allocator, CorrelatedValueProp, GVN, and InstCombine. This looks like the host of scalar optimizations kicking in more often, giving us a broad, slight slowdown.
> 
> I'm still working on doing before/after profile comparisons and other things to see if I can tease out the culprit here.
> 
> I also see several places where we can recoup a few percent in all likelihood; I'll try to tackle those if I can.
> 
> Ok, thanks to Chad for helping me get set up to look at this. I've implemented the type of optimization that *should* help matters if the inline cost computation were the problem. I've attached the patch. It reduces the number of inline cost computations by 25% for sqlite3. I have a plan for how to make even more invasive changes to the inliner that could potentially save another 10% or so, but the alarming thing is that this patch has *zero* impact on the -O3 compile time of the sqlite3 bitcode. =/ However, if I tweak the inline cost computation to simply return higher costs, or to reject a larger percentage of the functions, I can immediately recoup the entire 9% regression and a lot more.
> 
> As far as I can tell, this is a symptom of the new inline cost metric exposing more (good) inlining opportunities, the scalar optimizations in LLVM taking advantage of them, and chewing on the code more. The unfortunate thing is that we're not getting any significant runtime improvements out of this (or are we?).

I saw improvements and regressions in both compile-time and execution-time.  Either Daniel or I will send you more comprehensive numbers off-list.

> 
> I think the only real solution is to work on making the various scalar optimizations less expensive. I see a few opportunities for this already after staring at the profile for a while. I'll try to look into those as I have time. =/ I wish I had a better answer here. Other ideas? Thoughts?

I completely agree.  No great ideas here, but feel free to file bug reports as you see opportunities for improvement.

> 
> I've attached the patch which caches some inline cost queries. As I said, it caches about 25% of them, at least on this test case. Even so, I'm not sure we should do it because it adds complexity and ugliness to the code. Let me know.
> <cache-callercallers.diff>
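
[Archive note: the caching idea described above can be sketched roughly as follows. This is an illustrative memoization of inline cost results keyed by (caller, callee) pairs, with invalidation when a caller's body changes; the names and the stand-in cost function are hypothetical, not LLVM's actual API — the real change is in the attached diff.]

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the real analysis: the actual computation
// walks the callee's IR and is comparatively expensive.
static int ComputeInlineCost(const std::string &Caller,
                             const std::string &Callee) {
  return static_cast<int>(Caller.size() + Callee.size());
}

// Memoize results per (caller, callee) pair so repeated queries --
// e.g. when the inliner revisits the same call site after working on
// other callers -- skip the recomputation. Illustrative only.
class InlineCostCache {
  std::map<std::pair<std::string, std::string>, int> Cache;

public:
  unsigned Hits = 0, Misses = 0;

  int getInlineCost(const std::string &Caller, const std::string &Callee) {
    auto Key = std::make_pair(Caller, Callee);
    auto It = Cache.find(Key);
    if (It != Cache.end()) {
      ++Hits;
      return It->second;
    }
    ++Misses;
    return Cache[Key] = ComputeInlineCost(Caller, Callee);
  }

  // Inlining into Caller changes its size, so any cached costs for
  // call sites inside Caller are stale and must be dropped.
  void invalidateCaller(const std::string &Caller) {
    for (auto It = Cache.begin(); It != Cache.end();)
      It = (It->first.first == Caller) ? Cache.erase(It) : std::next(It);
  }
};
```

The hit rate of such a cache bounds the possible win: if the cost computation is only a few percent of total compile time (as the profile above shows), even a 25% reduction in queries cannot recover a 9% regression, which is consistent with the zero-impact result reported.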

I'm fine with leaving the code as is and focusing on the other scalar optimizations.

 Chad

