[PATCH] D30651: [InlineCost, -Oz] Don't take into account the penalty of a fast call of frequently used functions
Evgeny Astigeevich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Mar 11 09:06:34 PST 2017
eastig added a comment.
In https://reviews.llvm.org/D30651#698341, @efriedma wrote:
> > I guess if it was not a loop GEP would have been folded into a load/store instruction. Anyway the cost calculation for GEP is not correct.
> > Maybe instead of 'return true' it should be 'return isGEPFree(I);'. Then TTI::getGEPCost would return it is not free.
>
> Oh, the inliner has its own equivalent? Anyway, that isn't really the point; the point is that we're concluding the GEP is free because we're assuming the users of the GEP are loads and stores, rather than PHI nodes/calls/etc.
My understanding of GEP was exactly as you describe. I've browsed the LangRef and http://llvm.org/docs/GetElementPtr.html and found nothing that prohibits using GEP in this way. I think this representation of a pointer increment in a loop is better than:
inc.offset.0 = phi(offset, inc.offset)
inc.offset = inc.offset.0 + 1
ptr = GEP(base, inc.offset.0)
However, the latter form is the one that gives the correct cost.
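For illustration, here is a source-level sketch of the two loop shapes being compared. The function names are made up for this example, and which IR a given compiler actually produces for each loop depends on the optimization pipeline, so the comments about PHI nodes and GEPs describe the typical case only.

```cpp
#include <cassert>
#include <cstddef>

// Pointer-increment form: the pointer itself is the loop-carried value.
// In IR this typically becomes a GEP whose result feeds a PHI node,
// so the GEP has a user that is not a load or a store.
int sumByPointer(const int* base, std::size_t n) {
    assert(base != nullptr || n == 0);
    int sum = 0;
    const int* end = base + n;
    for (const int* p = base; p != end; ++p) {
        sum += *p;
    }
    return sum;
}

// Offset-increment form: the offset is the loop-carried value and the
// address is recomputed from the base on every iteration, so the GEP
// is used only by the load and can usually fold into its addressing mode.
int sumByOffset(const int* base, std::size_t n) {
    assert(base != nullptr || n == 0);
    int sum = 0;
    for (std::size_t i = 0; i != n; ++i) {
        const int* p = base + i;
        sum += *p;
    }
    return sum;
}
```

Both functions compute the same result; they differ only in which value is carried around the loop.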
BTW, I ran an experiment in which such GEPs are not free and the call penalty takes lower values: 5 and 10. A penalty of 5 is the cost of one instruction; 10 is the cost of two instructions. This model of the call penalty might suit ARM, where a call just saves the PC into a register, so it is something like a MOV + BR sequence.
I didn't use the many-callers heuristic. The inline threshold was left unchanged at 5.
The functions I didn't want inlined were not inlined. The code size regressions in the C++ benchmarks improved, but some small regressions remain.
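To make the arithmetic of the experiment concrete, here is a toy cost model. It is not LLVM's InlineCost implementation: the `Kind` enum, `toyCost`, and `toyShouldInline` are hypothetical names, and two rules are assumptions made for this sketch, namely that a call costs one instruction plus the penalty and that a function is inlined only when its cost is strictly below the threshold.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction categories for this sketch only.
enum class Kind { Plain, FreeGEP, CostedGEP, Call };

// One instruction costs 5, as stated in the comment above.
constexpr int kInstrCost = 5;

// Sum the cost of a callee body under a given call penalty.
// Assumption: a call costs one instruction plus the penalty.
int toyCost(const std::vector<Kind>& body, int callPenalty) {
    assert(callPenalty >= 0);
    int cost = 0;
    for (Kind k : body) {
        switch (k) {
        case Kind::Plain:
        case Kind::CostedGEP:
            cost += kInstrCost;
            break;
        case Kind::FreeGEP:
            break;  // treated as folded into its user, so no cost
        case Kind::Call:
            cost += kInstrCost + callPenalty;
            break;
        }
    }
    return cost;
}

// Assumption: inline only when the cost is strictly below the threshold.
bool toyShouldInline(int cost, int threshold) {
    return cost < threshold;
}
```

With a threshold of 5, a body whose only instruction is a free GEP has cost 0 and would be inlined in this model, while the same body with a costed GEP has cost 5 and would not. That is the kind of shift the experiment relies on.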
https://reviews.llvm.org/D30651