[llvm-commits] X86 FastISel: Emit immediate call arguments locally to save stack size when compiling with -O0
Ivan Krasin
krasin at google.com
Thu Aug 11 10:06:54 PDT 2011
On Thu, Aug 11, 2011 at 9:50 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Aug 11, 2011, at 9:36 AM, Ivan Krasin wrote:
>
>> https://spreadsheets.google.com/a/google.com/spreadsheet/ccc?key=0Ao3TPgpIlZ9ddHVpeDVCSzZHYXlJOTc0N1VKX1BFNVE&hl=en_US#gid=0
>
> Thanks. Those numbers look really good, except for:
>
>> There are a few major regressions which I believe are show stoppers for this
>> patch (the top of the spreadsheet above):
>>
>> objinst.llvm.bc 272 344 0 0.0027 245 184 0 0.0026 11.02% 86.96% 0.10%
>> cast.linked.bc 1053 1224 0 0.0058 924 664 0 0.0051 13.96% 84.34% 0.67%
>> cast.llvm.bc 1053 1224 0 0.0057 926 680 0 0.005 13.71% 80.00% 0.67%
>> objinst.linked.bc 320 360 0 0.0038 301 232 0 0.0036 6.31% 55.17% 0.19%
>>
>> The second column from the right is the stack size increase if the patch is applied.
>>
>> All of these examples could be described as "one global constant is
>> used across many calls in one function".
>> Probably, the fix should be "don't apply the optimization if the block
>> uses global variables", but I'm not sure it would help much.
>
> I don't understand how your patch can increase the stack space used. How does that happen? What is getting spilled?
The fast register allocator is too stupid.
This is the kind of code that allocates so much stack:
movq %rdx, 816(%rsp)
movq %rcx, %rdx
movq 816(%rsp), %r8
As you can see, it's equivalent to
movq %rdx, %r8
movq %rcx, %rdx
OK, it's stupid, but it could at least use the same stack slot for
these spills! Fast regalloc does not use live intervals, though, so it
cannot tell when a slot is dead and safe to reuse without further work.
The fast register allocator should probably be improved as well. In that
case, my patch would not regress stack size.
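
To illustrate the slot-reuse point above, here is a minimal sketch in Python. It is not LLVM code and not how any LLVM allocator is implemented. It assumes each spilled value has a known live range (first and last instruction index), which is exactly the information fast regalloc does not have. The function names and the example ranges are hypothetical.

```python
from typing import List, Tuple

# (first instruction index, last instruction index), both inclusive.
Interval = Tuple[int, int]


def slots_without_reuse(spills: List[Interval]) -> int:
    """Give every spilled value its own stack slot (no liveness information)."""
    return len(spills)


def slots_with_reuse(spills: List[Interval]) -> int:
    """Greedy interval colouring: reuse a slot once its previous value is dead."""
    busy_until: List[int] = []  # per slot: last index at which it is occupied
    for start, end in sorted(spills):
        for i, last_use in enumerate(busy_until):
            if last_use < start:
                # The slot's previous value is dead before this one starts.
                busy_until[i] = end
                break
        else:
            # Every existing slot is still live here, so allocate a new one.
            busy_until.append(end)
    return len(busy_until)


# Hypothetical live ranges for four spilled values in one function.
spills = [(0, 2), (3, 5), (6, 8), (7, 9)]
print(slots_without_reuse(spills) * 8, "bytes without reuse")
print(slots_with_reuse(spills) * 8, "bytes with reuse")
```

With 8-byte slots, the four short-lived spills in this made-up example need 32 bytes when each gets a fresh slot, but only 16 bytes when dead slots are reused.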
>
> /jakob
>
>