[llvm-commits] X86 FastISel: Emit immediate call arguments locally to save stack size when compiling with -O0

Ivan Krasin krasin at google.com
Thu Aug 11 09:42:50 PDT 2011


Sorry, I forgot to include the summary.

SUM(AsmInstrs.Patch)	9274880
SUM(AsmInstrs.Mainline)	9587131
Delta	-3.26%
	
SUM(StackSpace.Patch)	40819064
SUM(StackSpace.Mainline)	41912400
Delta	-2.61%
	
SUM(WallSeconds.Patch)	49.2047
SUM(WallSeconds.Mainline)	49.5969
Delta	-0.79%
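
For reference, the Delta rows above are consistent with a relative change of the Patch total against the Mainline total. A minimal sketch of that arithmetic follows; the function name percent_delta and the dictionary layout are illustrative only and are not part of the benchmark tool:

```python
def percent_delta(patch: float, mainline: float) -> float:
    """Relative change of the patched total vs. mainline, in percent."""
    return (patch - mainline) / mainline * 100.0


# Totals copied from the summary above: (Patch, Mainline).
totals = {
    "AsmInstrs": (9274880, 9587131),
    "StackSpace": (40819064, 41912400),
    "WallSeconds": (49.2047, 49.5969),
}

# Round to two decimals, matching the precision used in the summary.
deltas = {name: round(percent_delta(p, m), 2) for name, (p, m) in totals.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}%")
```

A negative value means the patched toolchain produced a smaller total than mainline.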
	
Ivan

On Thu, Aug 11, 2011 at 9:36 AM, Ivan Krasin <krasin at google.com> wrote:
> On Wed, Aug 10, 2011 at 4:16 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>>
>> On Aug 10, 2011, at 4:01 PM, Ivan Krasin wrote:
>>
>>> Hi Jakob,
>>>
>>> here is the new version of the patch. Now, the local cache does not
>>> cross the function call (with the exception of intrinsics, since they
>>> tend to be inlined).
>>> I have found it hard to emit the definition at the last use without
>>> slowing down the -O0 build (which is inappropriate). Please find the
>>> patch attached.
>>
>> Eric, could you take a look at this patch when you have the time, please?
>>
>>> I have also run my small benchmark tool over the test-suite bitcode files.
>>> The full log is attached to the e-mail; below are some excerpts.
>>> The first toolchain is the toolchain with the patch;
>>> the second toolchain is the unmodified llvm toolchain.
>>>
>>> The command line I used to run llc:
>>> Release/bin/llc -O0 -stats --time-passes -relocation-model=pic -O0
>>> -asm-verbose=false
>>
>> What architecture did you test?
> x86_64-unknown-linux-gnu
>
>>
>>> stdin/stdout/stderr were redirected to my tester. Every test was
>>> run once, so Seconds and WallSeconds could differ slightly if I ran
>>> the tool again.
>>
>> I'd rather not look through 700 K of raw data. Please provide a summary, including the worst regressions.
> https://spreadsheets.google.com/a/google.com/spreadsheet/ccc?key=0Ao3TPgpIlZ9ddHVpeDVCSzZHYXlJOTc0N1VKX1BFNVE&hl=en_US#gid=0
>
> There are a few major regressions which I believe are a show stopper for
> this patch (at the top of the spreadsheet above):
>
> objinst.llvm.bc 272     344     0       0.0027  245     184     0       0.0026  11.02%  86.96%  0.10%
> cast.linked.bc  1053    1224    0       0.0058  924     664     0       0.0051  13.96%  84.34%  0.67%
> cast.llvm.bc    1053    1224    0       0.0057  926     680     0       0.005   13.71%  80.00%  0.67%
> objinst.linked.bc       320     360     0       0.0038  301     232     0       0.0036  6.31%   55.17%  0.19%
>
> The second column from the right is the stack size increase when the patch is applied.
>
> All of these examples could be described as "one global constant is
> used across many calls in one function".
> Probably, the fix should be "don't apply the optimization if the block
> uses global variables", but I'm not sure it would help much.
>
> What do you think?
>
> Ivan
>>
>> /jakob
>>
>>
>



