[PATCH] D24338: [InlineCost] Remove CallPenalty and change MinSizeThreshold to 5

Tue Nov 1 23:40:57 PDT 2016

> On Nov 1, 2016, at 11:24 PM, Xinliang David Li <davidxl at google.com> wrote:
> 
> 
> 
> On Tue, Nov 1, 2016 at 10:39 PM, Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
> 
>> On Nov 1, 2016, at 10:01 PM, Xinliang David Li <davidxl at google.com <mailto:davidxl at google.com>> wrote:
>> 
>> 
>> 
>> On Tue, Nov 1, 2016 at 8:06 PM, Mehdi AMINI <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>> mehdi_amini added a comment.
>> 
>> (Add inline the answer to David, so that there is the relevant context)
>> 
>> 
>> 
>> ================
>> Comment at: lib/Analysis/InlineCost.cpp:938
>> -      if (!isa<InlineAsm>(CS.getCalledValue()))
>> -        Cost += InlineConstants::CallPenalty;
>>      }
>> ----------------
>> >>! In D24338, @davidxl wrote:
>> > Do you mean parameter passing? That should be counted independently  by the instruction visitor.
>> 
>> No, I don’t mean the parameters, the visitor account for one instruction per parameter just before this indeed.
>> However a call has other extra-cost on top of a regular instruction, even without any argument: it splits live-ranges and may cause spills around it to preserve registers that aren't callee-saved.
>> 
>> In this sense, it make sense that a call can "cost" more than a normal instruction.
>> 
>> yes, inlining can lead to increased register pressure leading to use of caller save registers (and spills around new callsites). However I think this should not be blindly applied -- better tied with register pressure analysis.
> 
> I’m not sure we’re talking on the same level here: I don’t claim the the register pressure will increase in the caller after inlining here, or at least not because of call instructions in the callee.
> My assumption (possibly incorrect), is that the current inliner cost model trades off the cost of the call site with the increased code size resulting from the inlining.
> 
> When we consider the overall “cost” of a callee: we a cost per-instruction in the visitor. The question is what is this cost representing? From a code size point of view, it make sense to add a penalty for call instructions because they have potentially more impacts (spills is an example that is specific to calls).
> 
> From the way it is currently implemented, it is likely the cost is to model the size increase. Yes, spills can result in size increase too -- but my point is that size cost of spills can not be unconditionally applied.  
>  
> To compute runtime cost/benefit, the 'runtime cost diff's of the callee and inline instance of the callee needs to be computed (with the help of profile info). Easwaran has a speed-up analysis patch that does that.
> 
> I don’t believe the patch as-is demonstrate that removing this penalty is the right thing to do: I’d like to see first how it compares to:
> 1) simply starting by deducing the cost of one call (the call-site considered for inlining) ;
> it is included in this patch, but should be split and committed first: that’s mostly a “bug fix”.
> 
> this sounds reasonable.
>  
> 2) simply increasing the threshold.
> 
> This is a different tuning IMO.

I agree. The way I see this is some sort of “smoke test” for the measurement: if a minor bump to the threshold provides very similar results, you may question the conclusion based on these.

>  
> Without these additional baseline, it is not clear to me that the measurement is sound: we can very well obtain better result because we will now inline a function that has multiple calls, while there can exist similar function without many calls but a bit more arithmetic that would also be profitable to inline.
> 
> Or do this: split the current CallPenalty into two parameters:
> * CallPenalty which is used in patch  #1 you suggested -- i.e, the value to be deducted from the cost of  callsite to be inlined
> * CallInstrCost -- this value is used to model the 'size cost' of call instruction (which considers potential spilling code). The value can be tuned independently -- lowed below 25 but higher than 5 (the default instruction cost).

Yes, this is perfectly matching the model I have in mind.

Thanks!

— 
Mehdi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161101/81d7300f/attachment.html>