[llvm-commits] X86 FastISel: Emit immediate call arguments locally to save stack size when compiling with -O0
Jakob Stoklund Olesen
stoklund at 2pi.dk
Tue Aug 2 18:03:55 PDT 2011
On Aug 2, 2011, at 5:43 PM, Ivan Krasin wrote:
>> Here is how fast isel currently works: It emits instructions bottom-up. When it needs a constant, the code for the constant is emitted to the top of the block, and it makes a note that the constant is now in a virtual register. Further uses of that constant simply get the virtual register.
>>
>> Here is what you should do: When you need a constant, don't emit the code immediately, but do allocate a virtual register and make a note that the constant needs to be materialized. Whenever you are about to emit a call, materialize all the pending constants. Then insert the call instruction.
>>
>> That way, virtual registers holding constants will never cross a call instruction, so they won't get spilled (much).
> OK, I've got your point and I like it.
Cool.
It is probably a good idea to keep track of the first instruction (i.e. last visited) using each constant. That is the best place to materialize the virtual register.
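The deferred scheme quoted above can be sketched as a toy model. This is only an illustration under stated assumptions, not FastISel code: the class `BottomUpEmitter`, its methods, and the textual "instructions" are all hypothetical. It materializes pending constants at each call and at the top of the block, and does not model the refinement of materializing at the first user of each constant.

```python
class BottomUpEmitter:
    """Toy model (hypothetical, not LLVM API) of bottom-up instruction emission."""

    def __init__(self):
        self.emitted = []      # instructions in emission order (bottom of block first)
        self.pending = {}      # constant value -> virtual register awaiting materialization
        self.next_vreg = 0

    def use_constant(self, value):
        # Allocate a virtual register now, but defer emitting the constant itself.
        if value not in self.pending:
            self.pending[value] = f"%v{self.next_vreg}"
            self.next_vreg += 1
        return self.pending[value]

    def flush_constants(self):
        # Materialize every pending constant at the current (topmost) position.
        for value, vreg in self.pending.items():
            self.emitted.append(f"{vreg} = const {value}")
        self.pending.clear()

    def emit(self, opcode, constants):
        if opcode == "call":
            # Constants used below the call are materialized below it,
            # so their registers are never live across the call.
            self.flush_constants()
        operands = [self.use_constant(c) for c in constants]
        self.emitted.append(f"{opcode} {', '.join(operands)}")


def select_block(instructions):
    """Visit instructions last-to-first and return the code in top-down order."""
    emitter = BottomUpEmitter()
    for opcode, constants in reversed(instructions):
        emitter.emit(opcode, constants)
    emitter.flush_constants()  # whatever is still pending goes to the top of the block
    return list(reversed(emitter.emitted))


block = [("call", [7]), ("add", [7]), ("call", [7]), ("store", [7])]
for line in select_block(block):
    print(line)
```

In this model the constant 7 is rematerialized after each call instead of being kept in one register for the whole block, which trades a few extra instructions for shorter live ranges.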
>> For patches like this, please provide measurements of code size and compile time as well.
> I agree that metrics are the key to optimizing the code. Here are my
> metrics: time and stack used.
>
> /usr/bin/time -f '%U' Release/bin/llc jsinterp.bc -march=x86-64 -O0
> 0.42
This is good, but you should probably time the integrated assembler with -filetype=obj, and definitely -asm-verbose=false. You can also use 'llc -time-passes' to get per-pass timing.
Your changes here affect the isel pass, of course, but also regalloc and asm-writer.
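As a rough sketch of how such a timing run might be scripted, the helper below builds an llc command line from the flags mentioned in this thread and times it. The function names, the output file name, and the choice of best-of-N wall-clock time are assumptions for illustration; the quoted measurement used `/usr/bin/time -f '%U'`, which reports user CPU time rather than wall-clock time.

```python
import subprocess
import time


def build_llc_command(bitcode, output, llc="llc", time_passes=False):
    """Build an llc invocation using the flags discussed in this thread."""
    cmd = [
        llc, bitcode,
        "-march=x86-64", "-O0",
        "-filetype=obj",        # time the integrated assembler
        "-asm-verbose=false",
        "-o", output,
    ]
    if time_passes:
        cmd.append("-time-passes")  # per-pass timing report
    return cmd


def time_command(cmd, runs=3):
    """Return the best wall-clock time in seconds over several runs (an assumed policy)."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


# Only prints the command; call time_command(...) on a machine where llc is installed.
print(" ".join(build_llc_command("jsinterp.bc", "jsinterp.o")))
```

Running the same command before and after the patch, on the same inputs, gives comparable numbers; adding `time_passes=True` shows how the time splits between isel, regalloc, and the asm-writer.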
> I calculate the amount of stack needed (thx to Rafael for the
> suggestion) with the following script:
>
> krasin at krasin$ cat calc_stack.sh
Neat. This is a good metric.
I wonder if you could add a statistic to PrologEpilogInserter.cpp instead? That would provide a target-independent stack usage metric with 'llc -stats'.
> Jakob, are you fine with the proposed metrics?
Yes, but please measure generated code size as well. On Darwin, 'size -m' will tell you the size of the __text section in bytes. Linux probably has something similar. Be careful, some tools round the sizes up to the nearest page size.
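To see why page rounding matters, here is a small worked example. The 4096-byte page size and the byte counts are assumptions chosen for illustration, not measurements from this patch.

```python
def round_up_to_page(size, page_size=4096):
    """Round a byte count up to the next multiple of page_size."""
    return -(-size // page_size) * page_size  # ceiling division, then scale back


# Two code sizes that differ by 2000 bytes report the same rounded size.
print(round_up_to_page(10000))  # 12288
print(round_up_to_page(12000))  # 12288
```

A tool that reports rounded sizes would show no change between these two builds, so use one that reports exact byte counts.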
Make sure you measure code size with -relocation-model=pic. The address of a global is a constant that may require multiple instructions to materialize.
LLVM's nightly test suite should provide you with plenty of bitcode examples.
/jakob