[PATCH] D24338: [InlineCost] Remove CallPenalty and change MinSizeThreshold to 5

David Li via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 4 14:16:31 PDT 2016


davidxl added inline comments.


================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:188
+  /// and InlineCost changed to delegate to getCallCost.
+  int getInliningCallPenalty() const;
+  
----------------
chandlerc wrote:
> davidxl wrote:
> > I think it is a mistake to conflate two different things here. One is the penalty of 'not' inlining a callsite (aka the benefit of inlining it). This penalty models the branch(call)/return, function prologue and epilogue etc. It can also model the size impact of call instruction. The other is the overhead of a newly exposed callsite from the inline instance. The latter models the potential caller save register spills etc.  The former is the one that needs to defined in TTI.  For the latter, I think it should stay in Inliner proper. In the future, it can be replaced by target dependent register pressure analysis.  In other words, this interface should be getCallPenalty().  
> Sorry I'm a bit late to the thread, but I'm confused by this and some of the other comments.
> 
> There are definitely two concepts here: the cost of having a call in the instruction stream, and the expected "benefit" of inlining. But I don't think `CallPenalty` is really being used to model the latter concept these days. The *threshold* is what models the latter concept.
> 
> So originally, in early 2010, we started using `CallPenalty` scaled by the number of arguments to track the the "caller cost of having a call". Then later in 2010, we replaced this usage of `CallPenalty` with a new and completely unrelated usage to model the fact that "big" functions are "slow". Even though it was applied to a threshold that is primarily size based. Then a few years later we *completely changed how we even think about inline cost*. Since then, the idea of functions being either "big" or "slow" and modeling that with `CallPenalty`, IMO, no longer makes sense in the inline cost analysis *at all*. So I suspect that the current usage of `CallPenalty` is just wrong. If it is doing anything, it is augmenting the basic call cost to fix a failure for it to account for caller (size) cost of *having* a call instruction.
> 
> My only real concern with adding an API is understanding why we can't just make this what the existing `getCallCost` do exactly this. In fact, it already does most of this. Is there just some way we can adjust it to allow us to transition from `CallPenalty` to `getCallCost` to model the caller-side cost? If the name is too confusing, we could rename it `getCallerCallCost` or some such...
> 
> 
> However, I would *not* merge `getInliningThresholdMultiplier` into this. That is the target's tool to adjust the *threshold* which models the other thing you are talking about David. I agree they should remain separate. If we need more granular control of the threshold in a target, we should just add an API that has that granularity (with an understanding of why it is needed).
> 
> 
> And I still think it is reasonable to factor the caller cost *into* the threshold (after it is scaled), as it does model one aspect of the "benefit" of inlining. The threshold itself models the rest, and the multiplier can scale it appropriately.
I agree it is confusing.   My suggestion of handling runtime cost modeling in this patch actually makes it worse, so I take it back -- the runtime cost modeling part can be handled later in a different patch. So let's go back and focus on size modeling intended by Jame's original patch.

The fundamental problem Jame's noticed is that the cost (size) modeling of the original callsite (inline candidate) and the cost of introduced new site is not handled consistently. Example

caller () {
         callee ();    // to be inlined;
}
callee () {
     new_cs1();
     ..
}

When computing the cost of inlining callee(), the inline cost analysis first subtract the callsite cost -- but only partially: it subtracts the cost associated with parameter passing, but does not subtract other size penalty associated with the eliminated callsite (presumably higher than base instruction cost). However, when considering the cost of new_cs1(), it not only adds the parameter passing cost, but also the additional penalty.   Theoretically, if a simple wrapper function is inlined, there should be zero size impact, but in the current implementation, the wrapper function is blocked from being inlined.

In short, I think the simplest fix to the problem is to make the cost adjustment for the original and new callsites consistent -- it is very similar to the first version of the patch with the difference that CallPenalty is retained.

Regarding 'getCallCost' -- it computes the cost of both parameter passing and call itself. Ideally it should be used in inline cost adjustment, but unfortunately implemented slightly differently -- e.g., the parameter passing cost computation is less sophisticated than what inliner uses.


Repository:
  rL LLVM

https://reviews.llvm.org/D24338





More information about the llvm-commits mailing list