[PATCH] Inliner Enhancement
liujiangning1 at gmail.com
Thu Mar 19 19:34:41 PDT 2015
>> Comparing even more complicated algorithm, yes it is simple, but comparing
>> with the original solution, it can avoid code bloat of abnormal programs
>> like repeatedly calling the same function a lot of times.
> so this heuristic is to reduce cold bloat?
If you read my initial post again, you will notice it comes from (2.b),
which is for code size.
> Adding simple heuristic is easy, the problem with them (without clear
> benefit, cost analysis) is that you can easily find a counter example (low
> SNR). Those heuristic are common for compilers that do benchmark hacks.
I don't think so. From an engineering point of view, complicated code is
hard to maintain and develop further. Also, even if more complicated
logic can obtain some more benefit for a specific benchmark, I still
don't think it is good practice, because it is likely to leave a
lot of corner cases or holes that expose counterexamples to other engineers.
Yes, my heuristic rules are simple, so why can't we just take them if they
are low-hanging fruit? I have specific cases showing that similar threshold
increases can have significantly different performance impact on
different benchmarks. In one case, if you raise the threshold above A,
you get a ~5% performance improvement, but in another case, if you
raise it further to B, you get a 3~5% regression. The
regression comes purely from the increase in register spilling code. At
inlining time, it's hard to build an accurate register cost model, even if
you say you can capture it for some cases. The same argument applies here:
it would be easy to find a counterexample for a more complicated algorithm
too. For the example I gave (sorry, I can't share the details with you
because of the SPEC license), first, it's hard to guess which callee is hot,
and second, even if you know both are hot, it is still hard to decide that
the threshold should be between A and B. Therefore, the question now is:
do you want to capture this case or not?
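
To make the A/B point concrete, here is a toy sketch. It is not LLVM's
actual cost model: shouldInline, kCostA, kCostB and kThreshold are
hypothetical names, and the numbers are made up, since the real values
come from a benchmark I can't share.

```cpp
// Hypothetical toy model, NOT LLVM's inliner: a callee is inlined when its
// estimated cost does not exceed the threshold.
constexpr bool shouldInline(int calleeCost, int threshold) {
  return calleeCost <= threshold;
}

// Made-up costs standing in for the two cases described above.
constexpr int kCostA = 300;  // callee whose inlining gives the speedup
constexpr int kCostB = 400;  // callee whose inlining adds register spills

// Any threshold in [kCostA, kCostB) captures the first case and avoids
// the second one.
constexpr int kThreshold = 350;

static_assert(shouldInline(kCostA, kThreshold), "beneficial callee is inlined");
static_assert(!shouldInline(kCostB, kThreshold), "spilling callee is not inlined");
```

With this model, raising the threshold all the way to kCostB would inline
both callees, which corresponds to the regression case.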
>> I'm not sure what those more sophisticated heuristic rules you are
>> talking about are, but if you tend to tune them for a specific benchmark, I
>> would say it definitely does not make sense. I would appreciate it if you
>> could share your results with the community. Then we can see how we
>> can reasonably move on.
> The plan is to tune inliner (both non-PGO and PGO) for real world programs
> and cross-validate with SPEC benchmarks. This work will definitely be
> shared with community.
Given that it is still a plan for you, how can you justify it without data?
If you still think your idea can be better, I think it can be the 2nd step
following this simple heuristic.