[llvm-commits] X86 FastISel: Emit immediate call arguments locally to save stack size when compiling with -O0

Thu Aug 11 11:03:16 PDT 2011

On Thu, Aug 11, 2011 at 10:20 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Aug 11, 2011, at 10:06 AM, Ivan Krasin wrote:
>
>> On Thu, Aug 11, 2011 at 9:50 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>>> I don't understand how your patch can increase the stack space used. How does that happen? What is getting spilled?
>> The fast register allocator is too stupid.
>>
>> This is the code that allocates so much stack:
>>        movq    %rdx, 816(%rsp)
>>        movq    %rcx, %rdx
>>        movq    816(%rsp), %r8
>>
>> As you can see, it's equivalent to
>> movq %rdx, %r8
>> movq %rcx, %rdx
>
> Yep, RAFast will spill when it gets in trouble. I guess I don't understand why your patch would cause it to get in trouble. Is it because the constants are not materialized immediately before they are used?

It gets confused on this:

	%vreg67<def> = COPY %EAX; GR32:%vreg67
	%vreg59<def> = LEA64r %RIP, 1, %noreg, <ga:@.str1>, %noreg; GR64:%vreg59
	%vreg60<def> = LEA64r %RIP, 1, %noreg, <ga:@.str8>, %noreg; GR64:%vreg60
	%vreg61<def> = LEA64r %RIP, 1, %noreg, <ga:@.str7>, %noreg; GR64:%vreg61
	%vreg62<def> = LEA64r %RIP, 1, %noreg, <ga:@.str9>, %noreg; GR64:%vreg62
	ADJCALLSTACKDOWN64 0, %RSP<imp-def>, %EFLAGS<imp-def>, %RSP<imp-use>
	%RDI<def> = COPY %vreg59; GR64:%vreg59
	%ESI<def> = COPY %vreg41; GR32:%vreg41
	%RDX<def> = COPY %vreg60; GR64:%vreg60
	%RCX<def> = COPY %vreg61; GR64:%vreg61
	%R8<def> = COPY %vreg62; GR64:%vreg62
	%R9D<def> = COPY %vreg41; GR32:%vreg41
	%AL<def> = MOV8ri 0
	CALL64pcrel32 <ga:@printf>[TF=6], %AL, %RDI, %ESI, %RDX, %RCX, %R8, %R9D, %RAX<imp-def>, %RCX<imp-def,dead>, %RDX<imp-def,dead>, %RSI<imp-def,dead>, %RDI<imp-def,dead>, %R8<imp-def,dead>, %R9<imp-def,dead>, %EFLAGS<imp-def,dead>, %RSP<imp-use>, ...
	ADJCALLSTACKUP64 0, 0, %RSP<imp-def>, %EFLAGS<imp-def>, %RSP<imp-use>

This issue is unrelated to my patch (fast regalloc already has to use
the stack for this even now), but my patch makes the cost of this
regalloc failure too high.
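
To make the claim about the spill sequence quoted earlier concrete, here
is a minimal sketch of a toy register machine that runs both the
spill/reload sequence and the two-move version and compares the final
registers. The model (dict entries for registers and the stack slot, the
placeholder values "A" and "B", the `run` helper) is purely illustrative
and is not how RAFast works internally.

```python
# Toy model: registers and stack slots are plain dict entries.
# Everything here is a hypothetical illustration, not LLVM code.
def run(program, state):
    """Execute a list of (src, dst) moves on a copy of `state`."""
    state = dict(state)
    for src, dst in program:
        state[dst] = state[src]
    return state

# Placeholder initial values for the registers involved.
initial = {"rdx": "A", "rcx": "B", "r8": None}

# What fast regalloc emitted: spill %rdx, overwrite it, reload into %r8.
spilled = [("rdx", "816(%rsp)"), ("rcx", "rdx"), ("816(%rsp)", "r8")]
# The register-only equivalent from the earlier message.
direct = [("rdx", "r8"), ("rcx", "rdx")]

regs = ("rdx", "rcx", "r8")
after_spilled = run(spilled, initial)
after_direct = run(direct, initial)

# Both sequences leave the three registers in the same state;
# only the spilled version touches a stack slot.
same_registers = all(after_spilled[r] == after_direct[r] for r in regs)
```

The only observable difference in this model is the extra stack slot,
which is exactly the cost being discussed.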

I'm a little confused about what to do next. Should I:

- Commit this patch and improve fast regalloc afterwards,
or
- Improve fast regalloc first and resurrect this patch after that?

Ivan
>
>> Ok, it's stupid, but it could (at least) use the same stack slot for
>> these things! But fast regalloc does not use live intervals, so it
>> can't decide to reuse stack slots w/o an improvement.
>> Probably, fast regalloc should be improved as well. In this case, my
>> patch would not have regressions for stack size.
>
> Please file a PR for that.
>
> A primitive form of stack slot coloring would be to reuse stack slots for spilled local live ranges.
>
> RAFast doesn't actually know which live ranges are local until it has assigned the last use, but perhaps that could be improved.
>
> /jakob
>
>
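
As a rough illustration of the "primitive stack slot coloring" idea
quoted above, the sketch below compares one-slot-per-spill against a
greedy scheme that reuses a slot once the previous spilled range has
ended. It assumes the local live ranges are known up front as
(name, first use, last use) tuples, which, as noted above, RAFast does
not currently know until it has assigned the last use. The function
names and the sample ranges are made up for the example.

```python
import heapq

def naive_slots(ranges):
    """One fresh stack slot per spilled live range (no reuse)."""
    return {name: i for i, (name, _start, _end) in enumerate(ranges)}

def reuse_slots(ranges):
    """Greedy reuse: hand a slot back once its live range has ended."""
    assignment = {}
    busy = []       # min-heap of (end, slot) for occupied slots
    free = []       # min-heap of slot indices available for reuse
    next_slot = 0
    for name, start, end in sorted(ranges, key=lambda r: r[1]):
        # Release every slot whose range ended before this one starts.
        while busy and busy[0][0] < start:
            _, slot = heapq.heappop(busy)
            heapq.heappush(free, slot)
        if free:
            slot = heapq.heappop(free)
        else:
            slot = next_slot
            next_slot += 1
        assignment[name] = slot
        heapq.heappush(busy, (end, slot))
    return assignment

# Hypothetical spilled local ranges: (name, first index, last index).
spills = [("a", 0, 2), ("b", 3, 5), ("c", 4, 8), ("d", 6, 7)]
naive = naive_slots(spills)
reused = reuse_slots(spills)
naive_count = len(set(naive.values()))
reused_count = len(set(reused.values()))
```

With these sample ranges the naive scheme needs four slots while the
greedy one needs two, since only "c" overlaps its neighbours.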