[PATCH] D30651: [InlineCost, -Oz] Don't take into account the penalty of a fast call of frequently used functions

Sat Mar 11 09:06:34 PST 2017

eastig added a comment.

In https://reviews.llvm.org/D30651#698341, @efriedma wrote:

> > I guess if it was not a loop GEP would have been folded into a load/store instruction. Anyway the cost calculation for GEP is not correct.
> >  Maybe instead of 'return true' it should be 'return isGEPFree(I);'. Then TTI::getGEPCost would return it is not free.
>
> Oh, the inliner has its own equivalent?  Anyway, that isn't really the point; the point is that we're concluding the GEP is free because we're assuming the users of the GEP are loads and stores, rather than PHI nodes/calls/etc.

My understanding of GEP was exactly as you wrote. I've browsed the LangRef and http://llvm.org/docs/GetElementPtr.html and have not found anything that prohibits using GEP in such a way. I think a GEP that feeds the loop PHI directly is a better representation of a pointer increment in a loop than:

  inc.offset.0 = phi(offset, inc.offset)
  inc.offset = inc.offset.0 + 1
  ptr = GEP(base, inc.offset.0)

However, the latter form gives the correct cost, because there the GEP's user is a memory access rather than a PHI node.
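To make the two forms concrete, here is a rough source-level sketch in C++. The function names are made up for illustration, and which IR form a compiler actually emits depends on the optimization pipeline, so treat the comments as the typical shape rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Form 1: the pointer itself is the loop-carried value. In IR this can
// become a PHI of pointers whose incoming value is a GEP, so the GEP's
// user is a PHI node and it cannot simply be folded into a load/store.
int sumByPointer(const int *base, std::size_t n) {
  assert(base != nullptr || n == 0);
  int sum = 0;
  for (const int *p = base, *end = base + n; p != end; ++p)
    sum += *p;
  return sum;
}

// Form 2: an integer offset is the loop-carried value. The address is
// recomputed from base + offset on each iteration, so the GEP feeds the
// load directly and can be folded into its addressing mode.
int sumByOffset(const int *base, std::size_t n) {
  assert(base != nullptr || n == 0);
  int sum = 0;
  for (std::size_t offset = 0; offset != n; ++offset)
    sum += base[offset];
  return sum;
}
```

Both functions compute the same result; only the loop-carried value differs, and that is what changes who uses the GEP.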
BTW, I ran an experiment in which such GEPs are not free and the call penalty has lower values: 5 and 10. A penalty of 5 is the cost of one instruction; 10 is the cost of two instructions. This model of the call penalty might be good for ARM, where a call just saves the PC into a register, so it is something like a MOV + BR sequence.
I didn't use the many-callers heuristic. The inline threshold value was not changed; it was 5.
The functions I didn't want to inline were not inlined. The code size regressions in the C++ benchmarks improved, but there are still some small regressions.
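The arithmetic of that experiment can be sketched with a toy model in C++. This is not the InlineCost implementation: the struct, the function, and the example counts are hypothetical, and the real analysis applies further bonuses and credits that are omitted here. Only the instruction cost of 5 and the penalty values 5 and 10 come from the experiment described above.

```cpp
#include <cassert>

// From the experiment: one instruction costs 5.
constexpr int kInstrCost = 5;

// Hypothetical summary of a callee body, for illustration only.
struct CalleeSummary {
  int ordinaryInstructions; // instructions that are never free
  int loopGEPs;             // GEPs whose users are PHIs/calls, not loads/stores
  int calls;                // call instructions inside the callee
};

// Toy estimate: every ordinary instruction costs kInstrCost, every call adds
// callPenalty, and loop GEPs cost one instruction only when countLoopGEPs is
// true (the "such GEPs are not free" setting).
int estimateCost(const CalleeSummary &c, int callPenalty, bool countLoopGEPs) {
  assert(callPenalty >= 0);
  int cost = c.ordinaryInstructions * kInstrCost;
  if (countLoopGEPs)
    cost += c.loopGEPs * kInstrCost;
  cost += c.calls * callPenalty;
  return cost;
}
```

For a made-up callee with 3 ordinary instructions, 1 loop GEP, and 2 calls, the estimate is 30 with a penalty of 5 and 40 with a penalty of 10 when the GEP is counted, and 35 with a penalty of 10 when the GEP is treated as free.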

https://reviews.llvm.org/D30651