[PATCH] D52716: [Inliner] Penalise inlining of calls with loops at Oz
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 2 11:38:44 PDT 2018
dmgreen updated this revision to Diff 167983.
dmgreen added a comment.
I've added a new memcpy test from the original reproducer. It's a byte memcpy (people seem to love writing those), which I think is worth focusing on because it's small, yet inlining it still increases codesize. It expands to:
b .LBB1_2
.LBB1_1:
ldrb r3, [r1], #1
subs r2, #1
strb r3, [r0], #1
.LBB1_2:
cmp r2, #0
bne .LBB1_1
bx lr
I would say from this that it's hard to argue the geps aren't free (unless we count that they are 32-bit instructions? Do we do a lot of modelling of that?). They may be non-free on other architectures.
The LLVM IR is 13 instructions, 8 of which are simplified so 5 remain. Starting at a score of -45 (CallPenalty + Call + 3*Args), we end up at -20. We could add 1 for the branch, and another 2 (maybe) for the geps. That would get our score to -5, so we'd need another 2 instructions to cost something.
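To make that arithmetic easy to check, here is a rough sketch in Python. It is not the InlineCost.cpp logic, only the sums above. The CallPenalty of 25 is the value quoted in this message; the flat per-instruction cost of 5 is an assumption inferred from the -45 starting score, and the function names are made up for illustration.

```python
# Assumed constants: CALL_PENALTY = 25 is quoted in this message; INSTR_COST = 5
# is inferred from the -45 starting score and is not stated in the message.
INSTR_COST = 5
CALL_PENALTY = 25


def starting_score(num_args):
    """Credit for removing the call: the penalty, the call itself, and each argument."""
    return -(CALL_PENALTY + INSTR_COST + num_args * INSTR_COST)


def score(num_args, costed_instructions):
    """Starting credit plus a flat cost for every instruction that is not free."""
    return starting_score(num_args) + costed_instructions * INSTR_COST


start = starting_score(3)                    # CallPenalty + Call + 3*Args
current = score(3, 5)                        # the 5 instructions that remain
with_branch_and_geps = score(3, 5 + 1 + 2)   # also cost 1 branch and 2 geps
two_more = score(3, 5 + 1 + 2 + 2)           # two further costed instructions

print(start, current, with_branch_and_geps, two_more)  # -45 -20 -5 5
```

Under these assumptions the score only becomes positive once two more instructions beyond the branch and the geps are given a cost.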
Or we'd need to not start out with such a negative score. In this case, I think the args make sense, but the CallPenalty of 25 is maybe a bit high? As I understand it, this number (for codesize) is essentially like changing our threshold from <=0 to <=25, and saying "more optimisations may happen, let's fudge it a bit". So changing it would mean less inlining in general for minsize: it would affect more than just loops, so it's a larger change.
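A small sketch of why the two views are the same thing, again in Python and again only illustrative: the `<= 0` comparison is the framing used in this message rather than a claim about the exact check in the inliner, and both function names are hypothetical.

```python
CALL_PENALTY = 25  # value quoted in this message


def inlines_with_credit(body_cost, threshold=0):
    """Subtract the penalty from the cost up front, then compare with the threshold."""
    return body_cost - CALL_PENALTY <= threshold


def inlines_with_shifted_threshold(body_cost, threshold=0):
    """Leave the cost alone and raise the threshold by the penalty instead."""
    return body_cost <= threshold + CALL_PENALTY


# The two formulations agree for every cost in a sample range.
agree = all(
    inlines_with_credit(c) == inlines_with_shifted_threshold(c)
    for c in range(-100, 101)
)
print(agree)  # True
```

Since the penalty is applied to every call site, lowering it tightens the effective threshold for all minsize inlining decisions, not only the ones involving loops.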
> any cost to materializing "32" in @call2.
Yeah, that was what I was thinking of for the setup cost of the phis not being free. I see what you mean about operands remaining live after the phi. The same could be said for function arguments, right? They may be free if they can be put into the correct register, but won't be if they are also needed after the call.
> We could probably improve our handling of GEPs; for example, a GEP used by a loop PHI node is probably not free.
Exactly where would a gep be free? Just loads and stores and other geps? Perhaps memcpys if they are small enough? Or from other instructions in general?
I think that was a rather long way of saying that, if we are going to try to do this accurately, we should probably start with the CallPenalty. Let me know what you think; I'll try to get more numbers.
https://reviews.llvm.org/D52716
Files:
lib/Analysis/InlineCost.cpp
test/Transforms/Inline/ARM/loop-add.ll
test/Transforms/Inline/ARM/loop-memcpy.ll
test/Transforms/Inline/ARM/loop-noinline.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D52716.167983.patch
Type: text/x-patch
Size: 9376 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181002/5d4d6581/attachment.bin>